Cost Benefit Exploratory Visualization Analysis of Public vs Private Institutions

Nelson Chen
Posted on Oct 24, 2016


In 2008, I was a senior in high school applying to colleges around the country, eager to start the next phase of my life. Unfortunately, my application cycle fell right in the middle of the recent recession caused by the collapse of the housing market in 2006. Although I had gained admission to Northwestern University, a prestigious private school, I had to decide whether it was worth spending my parents’ life savings as well as taking out large loans. My other choice was my home state’s public school, State University of New York (SUNY), Binghamton. Although SUNY Binghamton was not as prestigious, it would have cost about a quarter as much and greatly reduced the financial burden on my family. Ultimately, I had to choose between an expensive private college and a cheaper public college. I ended up going to Northwestern University, but if I had had more data on the differences, I might have chosen differently.

It is not uncommon for students to have to choose between more prestigious private schools and cheaper public schools. However, as college tuition, student debt, and the need for a college degree are all on the rise, it is becoming ever more important to choose carefully. In fact, regarding the student debt bubble, billionaire entrepreneur Mark Cuban has said we are “going to see a repeat of what we saw in the housing market...”1. Furthermore, cumulative U.S. student debt is reportedly over $1.45 trillion, more than the total credit card and auto debt in the country.

In this post, we will do an exploratory analysis of the data released by the U.S. Department of Education and look at the costs and benefits of public and private colleges. The post focuses on predominantly bachelor’s degree granting schools and on the latest available year of data (2013), and it covers three aspects: cost, debt, and earnings. We will see U.S. maps of the cost and adjusted net cost, density plots of the median debt and median earnings after graduation, and lastly a scatter plot of net cost vs. median earnings.
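As a rough illustration of the filtering step described above, here is a minimal sketch of selecting predominantly bachelor’s degree granting schools and splitting them into public and private groups. The field names PREDDEG and CONTROL and their codes follow my reading of the College Scorecard data dictionary and are assumptions, as is the choice to count for-profit schools as private; the rows are made-up placeholders, not real records.

```python
# Toy rows with made-up values. Field names and codes are assumptions based
# on the College Scorecard data dictionary, not something this post confirms:
#   PREDDEG == 3      -> predominantly bachelor's degree granting
#   CONTROL == 1/2/3  -> public / private nonprofit / private for-profit
schools = [
    {"name": "Example Public U", "PREDDEG": 3, "CONTROL": 1},
    {"name": "Example Private College", "PREDDEG": 3, "CONTROL": 2},
    {"name": "Example Community College", "PREDDEG": 2, "CONTROL": 1},
    {"name": "Example For-Profit Institute", "PREDDEG": 3, "CONTROL": 3},
]

# Assumption: both private categories are lumped together as "private".
SECTOR = {1: "public", 2: "private", 3: "private"}


def bachelors_by_sector(rows):
    """Keep predominantly bachelor's schools and group their names by sector."""
    groups = {"public": [], "private": []}
    for row in rows:
        if row["PREDDEG"] != 3:
            continue  # skip schools that mostly grant other credentials
        groups[SECTOR[row["CONTROL"]]].append(row["name"])
    return groups


groups = bachelors_by_sector(schools)
print(groups)
```

The same grouping would then feed every plot in the post, so that public and private schools are always compared over the same kind of institution.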


U.S. Map of Cost and Adjusted Net Cost

[Maps: cost of attendance for public and private institutions]

The cost of attendance is the college’s reported estimate of the total cost needed per year, including tuition, living expenses, fees, etc. The public school map is dominated by green and yellow points ($10,000 to $30,000), whereas the private school map is dominated by orange and red points ($30,000+). The plots above confirm that most private institutions are more expensive than public institutions, by roughly $20,000.


However, when we plot the adjusted net cost (cost of attendance minus average grants and scholarships), the difference in cost is actually smaller: private institutions provide more financial aid but still ultimately cost more. The public school map is dominated by blue and green dots ($0 to $20,000), whereas the private school map is dominated by green and yellow dots ($10,000 to $30,000). Roughly speaking, public institutions give about $10,000 in aid, whereas private institutions give about $20,000.
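The adjusted net cost arithmetic can be sketched in a few lines. The dollar amounts below are round illustrative figures picked from the rough ranges quoted in this post, not values computed from the dataset, and the function name is hypothetical.

```python
def adjusted_net_cost(cost_of_attendance, avg_grant_aid):
    """Adjusted net cost = cost of attendance minus average grants and scholarships."""
    return cost_of_attendance - avg_grant_aid


# Round illustrative figures (assumptions), in dollars per year.
public_cost, public_aid = 20_000, 10_000
private_cost, private_aid = 40_000, 20_000

public_net = adjusted_net_cost(public_cost, public_aid)
private_net = adjusted_net_cost(private_cost, private_aid)

gap_before_aid = private_cost - public_cost  # sticker-price difference
gap_after_aid = private_net - public_net     # difference once aid is subtracted

print(f"gap before aid: ${gap_before_aid:,}")
print(f"gap after aid:  ${gap_after_aid:,}")
```

With these figures the gap halves once aid is taken into account, which matches the qualitative picture in the maps.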


Density Graphs of Debt and Median Earning


In the median debt density graph, the public and private curves have different peaks and a portion that overlaps. From this graph, it seems that median debt at most public schools is lower than at private schools. This makes sense: the previous U.S. maps showed that private schools cost more, so students logically have to take out more loans to pay for them.


Surprisingly, in the median earnings density graph, the median earnings (10 years after graduation) of public and private schools show little visual difference. The two density curves fall nearly on top of one another, with the peaks almost aligned, although the private curve has a little more variance. It seems that, in general, earnings are about the same regardless of whether the school is private or public.
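For readers curious about what a density graph computes, here is a minimal sketch of a Gaussian kernel density estimate that compares the peak location and spread of two groups. The earnings values are made up for illustration (in thousands of dollars), the bandwidth is an arbitrary choice, and this is not necessarily how the plots in this post were produced.

```python
import math
import statistics


def gaussian_kde(samples, bandwidth):
    """Return a function estimating the density of `samples` at a point x."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def density_peak(samples, bandwidth, grid):
    """Grid point where the estimated density is highest."""
    return max(grid, key=gaussian_kde(samples, bandwidth))


# Made-up median earnings per school, in thousands of dollars.
public_earnings = [34, 38, 40, 41, 43, 47]
private_earnings = [28, 35, 40, 42, 48, 58]

grid = [x * 0.5 for x in range(40, 141)]  # 20.0 to 70.0 in steps of 0.5
public_peak = density_peak(public_earnings, 4.0, grid)
private_peak = density_peak(private_earnings, 4.0, grid)

print("peaks:", public_peak, private_peak)
print("spread:", statistics.stdev(public_earnings),
      statistics.stdev(private_earnings))
```

Two curves with nearly the same peak but different spread is the pattern described above: similar typical earnings, with the private group more spread out.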


Scatter Plot of Net Cost vs. Median Earning


To get a better understanding of how cost and earnings relate to each other for each school, it is useful to show a scatter plot of net cost against median earnings. Here the public and private institutions appear to form their own clusters, with the public school cluster being cheaper than the private school cluster but at about the same earnings level. However, a small portion of the two clusters overlaps.


To understand the distributions a little more clearly, 2D density contours are overlaid on top of the scatter points to illustrate where the highest density regions are. The innermost contour line shows the densest region of each group. Here it can be seen that even though the two college types may overlap, the peaks of the two groups are separated.
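The idea behind the density contours can be sketched as a 2D kernel density estimate evaluated on a grid: contour lines are level sets of that surface, and the innermost contour surrounds its highest point. The (net cost, median earnings) pairs below are made up, in thousands of dollars, and the bandwidths and helper names are hypothetical choices for illustration only.

```python
import math


def kde2d(points, bw_x, bw_y):
    """Return a function estimating the 2D density of `points` at (x, y)."""
    norm = 1.0 / (len(points) * 2.0 * math.pi * bw_x * bw_y)

    def density(x, y):
        return norm * sum(
            math.exp(-0.5 * (((x - px) / bw_x) ** 2 + ((y - py) / bw_y) ** 2))
            for px, py in points
        )

    return density


def densest_point(points, xs, ys, bw_x, bw_y):
    """Grid point with the highest estimated density (inside the innermost contour)."""
    density = kde2d(points, bw_x, bw_y)
    return max(((x, y) for x in xs for y in ys), key=lambda p: density(*p))


# Made-up (net cost, median earnings) pairs, in thousands of dollars.
public = [(8, 38), (10, 40), (12, 41), (11, 43), (14, 39), (20, 42)]
private = [(18, 37), (22, 40), (24, 42), (23, 44), (26, 39), (30, 45)]

xs = list(range(0, 41))   # net cost axis
ys = list(range(25, 61))  # earnings axis
public_peak = densest_point(public, xs, ys, 3.0, 3.0)
private_peak = densest_point(private, xs, ys, 3.0, 3.0)

print("public peak (cost, earnings):", public_peak)
print("private peak (cost, earnings):", private_peak)
```

In this toy data the two peaks sit at clearly different net costs but similar earnings, which is the kind of separation the contour overlay is meant to reveal even when the scatter points overlap.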



In general, private institutions cost on the order of $20,000 more than public schools. However, private institutions give on the order of $10,000 more in financial aid, so their net cost is only about $10,000 higher. On the other side of the analysis, students tend to leave private institutions with more debt but earn about the same amount after graduation. Finally, a higher net cost does not seem to translate into higher earnings, so public schools are cheaper while their graduates earn around the same as those of private schools. All else being equal, the recommendation is to attend a public school to save money.


Future Steps
In this analysis, we focused on a high-level overview of whether the monetary investment in more expensive colleges (private institutions) is worth it, based on how much debt and earnings one comes out with after graduation. However, college is more than just money in and money out; many other factors define a good and worthwhile investment. These factors may include faculty-to-student ratio, school size, location, types of programs, and many others. In the future, a deeper analysis will be done to include these factors. In addition, this study only looked at predominantly bachelor’s degree granting institutions. It would be enlightening to see how schools that grant other levels of degrees, such as associate degrees, medical degrees, etc., play into earnings and investment returns.

About Author

Nelson Chen

Nelson has a Bachelor's degree from Northwestern University and a Master's degree from University of California, Berkeley in Mechanical Engineering. His graduate work specialized in developing and applying new Computational Fluid Dynamic algorithms to astrophysical fluid dynamic problems...