Examining Higher Education in the United States with the College Scorecard Dataset

Posted on Oct 15, 2017

The Department of Education collects a great deal of data from colleges and universities across the country and releases this data annually on a site called College Scorecard. The data is intended to help prospective college students and their families make decisions about which colleges are best for them.

I used R Shiny to create a tool which allows the College Scorecard data to be visualized via methods other than the ones already available on the College Scorecard site. My hope is that it will provide useful information about higher education in the United States both to those looking to find the right college and to those interested in trends in higher education as a whole.

I used the College Scorecard data from the school year 2014-2015. I restricted my analysis to institutions which offer bachelor's degrees and which provided data for all of the metrics I examined.
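The restriction described above can be sketched as follows. This is a minimal illustration in Python, not the R code behind the app. The field names (`offers_bachelors`, `admission_rate`, and so on) and the sample rows are hypothetical and do not match the actual College Scorecard column names or values.

```python
# Hypothetical field names and made-up rows, for illustration only.
METRICS = ["admission_rate", "default_rate", "median_debt"]


def is_missing(value):
    """Treat None and empty strings as missing data."""
    return value is None or value == ""


def filter_institutions(rows, metrics=METRICS):
    """Keep bachelor's-granting institutions that report every metric."""
    return [
        row
        for row in rows
        if row.get("offers_bachelors")
        and not any(is_missing(row.get(m)) for m in metrics)
    ]


rows = [
    {"name": "School A", "offers_bachelors": True,
     "admission_rate": 0.6, "default_rate": 0.04, "median_debt": 21000},
    # Dropped: one metric is missing.
    {"name": "School B", "offers_bachelors": True,
     "admission_rate": None, "default_rate": 0.07, "median_debt": 25000},
    # Dropped: does not offer bachelor's degrees.
    {"name": "School C", "offers_bachelors": False,
     "admission_rate": 0.9, "default_rate": 0.10, "median_debt": 12000},
]

kept = filter_institutions(rows)
print([row["name"] for row in kept])
```

Dropping incomplete rows keeps every plot based on the same set of institutions, at the cost of excluding schools that withhold or suppress some metrics.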

Questions for Analysis

  • Of the three types of institutions (private, public, and for-profit), which offers the best value for its students?
  • How do the institutions in each state compare to one another?

Shiny Application

I created a Shiny application to support the analysis by allowing the data to be visualized in four different ways: a data table view of the entire data set, density plots of single user-chosen variables, scatter plots of two user-chosen variables, and a geographic plot, built with Leaflet, of state averages of a user-chosen variable. The data table view is useful for examining single data points and for finding the best and worst institutions for a given metric via the 'sort by column' feature. The single-variable density plots reveal the approximate probability distribution of each variable for each type of institution. The two-variable scatter plots reveal the relationship between pairs of variables. The geographic plots show the mean value of the chosen variable by state.
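The per-state aggregation behind the geographic view can be sketched as follows. This is an illustration in Python rather than the app's R code. It assumes an unweighted mean across institutions (the post does not say whether the averages are weighted by enrollment), and the field names and rows are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def state_means(rows, variable):
    """Return the unweighted mean of `variable` for each state."""
    by_state = defaultdict(list)
    for row in rows:
        by_state[row["state"]].append(row[variable])
    return {state: mean(values) for state, values in by_state.items()}


# Made-up rows, for illustration only.
sample = [
    {"state": "NY", "median_debt": 20000},
    {"state": "NY", "median_debt": 30000},
    {"state": "TX", "median_debt": 24000},
]

print(state_means(sample, "median_debt"))
```

An unweighted mean gives a small college the same influence as a large university. Weighting by number of students would be a reasonable alternative if the goal is to describe the typical student rather than the typical institution.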

The variables available to view are:

  • Admission rate
  • Average family income
  • Median family income
  • Default rate
  • 3-year repayment rates for students from low-, middle-, and high-income families
  • Median debt
  • Number of students


Examining the single-variable plots for default rate and median debt, we see that the distributions for for-profit schools have fatter right tails than those for public and private schools. We also note that there are many private schools whose median debt is between $25,000 and $30,000.

The two-variable scatter plots revealed connections between the default rate and both the median debt and the median family income of students. We notice that beyond a threshold of around $27,000 in median debt, there are very few schools with low default rates.
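One way to check a threshold like this numerically is to compare the share of low-default schools on either side of it. The sketch below is in Python, not the app's R code. The 5% cutoff for a "low" default rate is an assumption made here for illustration, and the field names and rows are made up.

```python
def low_default_share(rows, debt_threshold=27_000, low_default=0.05):
    """Share of low-default schools at or below vs. above a debt threshold.

    Returns (share_below, share_above); a share is None for an empty group.
    """
    def share(group):
        if not group:
            return None
        low = sum(r["default_rate"] < low_default for r in group)
        return low / len(group)

    below = [r for r in rows if r["median_debt"] <= debt_threshold]
    above = [r for r in rows if r["median_debt"] > debt_threshold]
    return share(below), share(above)


# Made-up rows, for illustration only.
schools = [
    {"median_debt": 15000, "default_rate": 0.03},
    {"median_debt": 20000, "default_rate": 0.04},
    {"median_debt": 22000, "default_rate": 0.09},
    {"median_debt": 26000, "default_rate": 0.12},
    {"median_debt": 28000, "default_rate": 0.10},
    {"median_debt": 31000, "default_rate": 0.15},
]

share_below, share_above = low_default_share(schools)
print(share_below, share_above)
```

A large gap between the two shares would support the pattern seen in the scatter plot, though it would not by itself show that higher debt causes higher default rates.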

The geographic plot reveals that colleges and universities in southern states have a higher average default rate than those in other states.

Thus the visualizations in my Shiny application allowed me to notice the high default rates and median debts in a segment of the for-profit schools, as well as the geographic trend in default rates.
