Exploratory Data Analysis of College Tuition, Financial Support, and Campus Demographics among American Universities

Brian Perez Joseph
Posted on Feb 2, 2021

According to the National Center for Education Statistics, the college enrollment rate for students between the ages of 18 and 24 was 41% in 2018, one of the highest enrollment rates observed in the history of American colleges and universities. Students are enrolling in post-secondary institutions at historically high rates; even amid the COVID-19 pandemic, approximately 19.7 million students were projected to enroll for the Fall 2020 school year, which is consistent with the enrollment totals of previous years. Overall, enrollment rates are at or near their historical peak, and the pace of enrollment does not appear to be slowing, even in the face of the pandemic, highlighting the growing shift towards college education in the modern job market.

In today's job market, a Bachelor's degree is often a minimum requirement for entry-level positions, and career advancement in a given industry may require further certification or an advanced degree. This emphasis on higher accreditation means that an undergraduate education is often a requirement for entry into a wide range of professions. Because colleges and universities play such a crucial role in career advancement, understanding trends in tuition and campus demographics may give insight into how students approach entering the college system. Therefore, the goal of this project was to conduct an exploratory data analysis of college tuition, financial support, and campus demographics across American colleges and universities.

Economic Trends Across American Colleges & Universities

Nationwide Annual Trends in Total Tuition, Student Cost, and Financial Support

The college tuition data was taken from tuitiontracker.org and covers the total tuition price and median student cost from 2010 to 2018. The dataset comprises the annual tuition cost for nearly 3,000 American colleges and universities in the public, private, and for-profit sectors.

Figure 1: Line Plot of Annual Median Total Tuition Price from 2010 to 2018

In Figure 1, the line plot shows the median total tuition cost for the three school types: for-profit, private, and public. The blue segment represents private schools; the red segment represents for-profit schools; the green segment represents public schools. Overall, private schools across the United States had both the highest overall cost of the three institution types and the largest increase in median total tuition cost, rising by $11,138 over the span of eight years. For-profit schools had the second highest total cost, with an increase of $3,550 over the same period. Public schools had the lowest overall total tuition cost, with a median total cost of $3,861.
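The summary behind a plot like Figure 1 can be sketched as a grouped median followed by a difference between the first and last years. The snippet below is a minimal illustration in Python using only the standard library; the original project was built in R, so this is not the author's code. The row layout `(school_type, year, total_price)` and all of the numbers are hypothetical and are not taken from the tuitiontracker.org data.

```python
from collections import defaultdict
from statistics import median

# Hypothetical rows: (school_type, year, total_price).
# Values are made up for illustration only.
rows = [
    ("Private", 2010, 30000), ("Private", 2010, 34000), ("Private", 2010, 38000),
    ("Private", 2018, 41000), ("Private", 2018, 45000), ("Private", 2018, 49000),
    ("Public", 2010, 9000), ("Public", 2010, 11000),
    ("Public", 2018, 12000), ("Public", 2018, 14000),
]


def median_by_type_and_year(rows):
    """Return {(school_type, year): median total price} for the given rows."""
    groups = defaultdict(list)
    for school_type, year, price in rows:
        groups[(school_type, year)].append(price)
    return {key: median(prices) for key, prices in groups.items()}


def change_over_span(medians, school_type, start_year, end_year):
    """Difference in the median price between the end year and the start year."""
    return medians[(school_type, end_year)] - medians[(school_type, start_year)]


medians = median_by_type_and_year(rows)
private_change = change_over_span(medians, "Private", 2010, 2018)
public_change = change_over_span(medians, "Public", 2010, 2018)

print("Private change in median price:", private_change)
print("Public change in median price:", public_change)
```

Using the median rather than the mean keeps a handful of very expensive schools from dominating the trend line for each institution type.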

Figure 2: Line Plot of the Median Annual Student Cost from 2010 to 2018

In Figure 2, the line plot shows the median student cost (the amount of tuition paid directly by the student) across institution types from 2010 to 2018. Overall, for-profit schools had the highest median student cost, ranging from $21,111 to $25,406 over the eight-year span. The median student cost in private schools ranged from $19,611 to $22,336, and the median student cost in public schools ranged from $8,854 to $10,550 over the same span.

Figure 3: Line Plot of the Percentage of Tuition Covered by Financial Support from 2010 to 2018

Figure 3 shows a line plot of the median financial support across institution types from 2010 to 2018. Financial coverage was calculated as the percentage of the total tuition cost paid for through tuition assistance programs. Overall, private schools showed the highest financial support, ranging from 43% to 51% coverage over the eight-year span. Public schools showed the second highest support, ranging from 39% to 45% coverage. For-profit schools offered the lowest coverage, with financial support ranging from 19% to 32% over the eight years.
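As a rough illustration of the coverage calculation described above, the Python sketch below assumes that the amount covered by assistance equals the total tuition price minus the cost paid by the student. That relationship, the function name `coverage_pct`, and the sample figures are assumptions made for this example; the original R code may define coverage differently.

```python
from statistics import median


def coverage_pct(total_price, student_cost):
    """Percentage of the total price not paid directly by the student.

    Assumes financial support = total_price - student_cost (an assumption
    for this sketch, not a definition taken from the dataset).
    """
    if total_price <= 0:
        raise ValueError("total_price must be positive")
    return 100.0 * (total_price - student_cost) / total_price


# Hypothetical (total_price, student_cost) pairs for three schools.
schools = [(40000, 20000), (30000, 18000), (50000, 20000)]

coverages = [coverage_pct(total, cost) for total, cost in schools]
median_coverage = median(coverages)

print("Coverage per school (%):", coverages)
print("Median coverage (%):", median_coverage)
```

Computing the percentage per school first and then taking the median gives a different, and usually more robust, figure than dividing one median by another.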

Tuition Costs and Financial Support Across Income Levels

Figure 4: Dot plot of the median student costs across five different student income levels in the 2018 academic school year

Figure 4 shows a dot plot of the median student cost across income levels in 2018. Among private institutions, the median student cost gradually increases with income level, from $18,606 at the lowest income level (below $30,000) to $27,919 at the highest income level (over $100,000). Public institutions followed a similar trend, with the median cost rising from $8,178 at the lowest income level to $13,648 at the highest. For-profit institutions show no clear pattern with respect to income level, with median costs ranging from $19,887 to $24,439.

Figure 5: Dot plot of the median financial support across five different student income levels in the 2018 academic school year

Figure 5 shows a dot plot of the median financial support across income levels in the 2018 academic school year. For private schools, the amount of financial support decreased as the income level increased, beginning at $27,494 at the lowest income level and falling to $18,180 at the highest. Public schools followed a similar pattern, with the highest median support of $11,762 at the lowest income level, gradually decreasing to $6,292 at the highest. Once again, for-profit schools did not show any discernible pattern with respect to income level, with median financial support ranging from $7,806 to $12,358.
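The patterns read off Figures 4 and 5 (rising, falling, or no clear pattern across income levels) can be checked mechanically. The Python sketch below classifies a sequence of medians that is ordered from the lowest to the highest income level. The helper name `trend` and the example sequences are hypothetical; only the direction of each pattern mirrors the text.

```python
def trend(values):
    """Classify values ordered by income level as a simple trend label."""
    if len(values) < 2:
        raise ValueError("need at least two income levels to describe a trend")
    pairs = list(zip(values, values[1:]))
    if all(a < b for a, b in pairs):
        return "increasing"
    if all(a > b for a, b in pairs):
        return "decreasing"
    return "no clear pattern"


# Hypothetical medians across five income levels, lowest to highest.
private_support = [27000, 25000, 23000, 20000, 18000]
public_cost = [8000, 9500, 11000, 12500, 13500]
for_profit_support = [12000, 8000, 11000, 9000, 10000]

print("Private support:", trend(private_support))
print("Public cost:", trend(public_cost))
print("For-profit support:", trend(for_profit_support))
```

A strict check like this flags any single reversal as "no clear pattern"; a rank correlation would be a more forgiving alternative for noisy data.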

Demographic Trends Across American Colleges & Universities in 2014

The demographic data on college campuses was taken from the Chronicle of Higher Education, which examined the number of students enrolled in the 2014 academic school year with respect to their gender, race, and ethnicity across 4,725 American institutions.

Enrollment Proportions by Demographic Group Across Institution Type

Figure 6: Stacked bar plot of the proportion of enrolled students by demographic group color-coded by institution type

Figure 6 shows a stacked bar plot of the proportion of enrolled students in each demographic group, stacked by institution type, for 2014. In every demographic group, the majority of students (>50%) enrolled in public institutions. With the exception of non-resident foreigners, each demographic group had approximately 70% enrollment or higher in public institutions. Enrollment rates at private institutions ranged from 15% to 46% across demographic groups. For-profit enrollment rates were the lowest of the three, at less than 1% across all groups.
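The proportions in a stacked bar plot like Figure 6 come from dividing each group's enrollment at one institution type by that group's total enrollment. The Python sketch below shows that calculation under stated assumptions: the row layout `(group, institution_type, enrolled)`, the group labels, and the counts are all hypothetical and are not taken from the Chronicle of Higher Education data.

```python
from collections import defaultdict

# Hypothetical rows: (demographic_group, institution_type, enrolled_students).
rows = [
    ("Group A", "Public", 700), ("Group A", "Private", 290), ("Group A", "For-profit", 10),
    ("Group B", "Public", 540), ("Group B", "Private", 455), ("Group B", "For-profit", 5),
]


def enrollment_shares(rows):
    """Return {(group, institution_type): share of that group's enrollment}."""
    counts = defaultdict(int)   # enrollment per (group, institution type)
    totals = defaultdict(int)   # enrollment per group
    for group, inst_type, enrolled in rows:
        counts[(group, inst_type)] += enrolled
        totals[group] += enrolled
    return {
        (group, inst_type): n / totals[group]
        for (group, inst_type), n in counts.items()
    }


shares = enrollment_shares(rows)
for (group, inst_type), share in sorted(shares.items()):
    print(f"{group} / {inst_type}: {share:.1%}")
```

Because each group's shares sum to 1, the stacked bars all reach the same height, which makes the institution-type mix comparable across groups of very different sizes.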

Tuition Dollars Spent by Demographic Groups Across Institution Types

Figure 7 shows a bar plot of the proportion of tuition paid by each demographic group, segmented by institution type. As in the previous bar plot, in every demographic group the largest proportion of tuition spent went to public schools, followed by private and for-profit schools.

Overall Analysis & Conclusions

From the exploratory data analysis, one general trend observed in tuition was a continual increase with each academic school year. Among private schools nationwide, total tuition rose more steeply than in public and for-profit schools over the eight-year span. Public and for-profit schools had similar tuition increases over time, but public schools had the lowest overall total tuition. In terms of student cost and financial support, although private schools had the highest total tuition prices, they also provided the most financial support of the three institution types, covering nearly 50% of tuition by 2018. However, financial support decreases as the student's income level increases. Moreover, the demographic trends in 2014 show that most demographic groups overwhelmingly enrolled in public schools rather than private and for-profit schools. Public schools also received the greatest share of tuition dollars spent by each demographic group.

Overall, one major trend across American colleges and universities is that people in most major demographic groups opt for public colleges and universities when pursuing a higher degree. This trend may be because public institutions are more accessible, given their relative affordability and low barriers to entry. As a result, American public institutions host a much more diverse student population than private or for-profit institutions. One area for future study would be to examine additional factors such as retention rates, graduation rates, choice of major, age, and family status to better characterize students enrolling in colleges and gain an understanding of their academic journey.

Another major trend is that the overall cost of an education is increasing across all sectors, with no indication of a dip in price. The reasons for the tuition increases go beyond the scope of this dataset. However, a future analysis could compare the rise in tuition with two other measures: income levels across America and the salaries earned by college and university graduates. This would show whether the rise in tuition coincides with an improvement in American household incomes over the eight-year span, or with an increase in career salaries for college graduates.

Links & Sources

This exploratory data analysis was also visualized through an R Shiny dashboard, which can be accessed below along with the code for the dashboard on GitHub. The data for this project was taken from the R for Data Science community webpage and is also linked below.

Shiny Dashboard

GitHub Repository Code

College Tuition & Demographic Data Sources

About Author

Brian Perez Joseph


With a background in biomedical research and data science, Brian aims to utilize his quantitative background in the sciences and data programming skills to provide data-driven decision making strategies and key insights for real-world business problems.