Business School Rankings: Are they really important?

Posted on Jun 22, 2016

Pursuing an MBA is a dream for many students and is one of the most important decisions that one can make for his/her career. Though there are several benefits to doing a full-time MBA, it is still a very expensive proposition given the time and financial commitments that one would have to make during the course of the program. I wanted to analyze the full-time MBA program rankings of various business schools and see if the rank or tier of a business school really matters in determining the return on investment. The following questions were to be addressed as part of my analysis:

  1. Does the Business school's rank impact the post MBA compensation?
  2. How do the International Business Schools compare against their U.S. counterparts regarding the overall quality of the program and faculty?
  3. Are the average GMAT scores and work experience of the incoming cohort the same across all tiers of business schools?

In addition to answering the above questions, I wanted to build a Machine Learning model that would predict if a given business school is a top-tier school based on the school's admission, faculty, and program related data.

Data Preparation & Pre-processing

The data for the business school rankings were scraped from The Economist website using the Beautiful Soup library in Python. The website has rankings from 2011 to 2015, and additional details regarding the school, program, faculty, tuition cost, and recruitment can be obtained from other pages by clicking the school name on the home page.
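
The scraper itself used Beautiful Soup, and neither its code nor the page markup is shown in this post. As a stand-in, the sketch below uses Python's standard-library `html.parser` to pull a rank, a school name, and a detail-page link out of each table row. The sample markup, the `RankingParser` class, and the `/school/...` links are all hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical markup: the real Economist page structure is not shown in this post.
SAMPLE_HTML = """
<table>
  <tr><td>1</td><td><a href="/school/a">School A</a></td></tr>
  <tr><td>2</td><td><a href="/school/b">School B</a></td></tr>
</table>
"""

class RankingParser(HTMLParser):
    """Collects (rank, school name, detail link) from each table row."""

    def __init__(self):
        super().__init__()
        self.rows = []         # finished rows
        self._cells = []       # text of the cells in the current row
        self._href = None      # first link seen in the current row
        self._in_cell = False  # True while inside a <td>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._cells, self._href = [], None
        elif tag == "td":
            self._in_cell = True
            self._cells.append("")
        elif tag == "a" and self._href is None:
            self._href = dict(attrs).get("href")

    def handle_data(self, data):
        if self._in_cell:
            self._cells[-1] += data.strip()

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and len(self._cells) >= 2:
            self.rows.append((int(self._cells[0]), self._cells[1], self._href))

parser = RankingParser()
parser.feed(SAMPLE_HTML)
rankings = parser.rows
```

Each link collected this way could then be fetched in turn to gather the per-school details, which is the multi-page navigation described above.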

Economist site home page

The data extraction was a bit time-consuming given the multiple pages that had to be navigated and consolidated. The final data set consisted of 500 observations across five years with 29 variables.

The field names were cleaned up to remove long phrases and underscores, and a new variable was added to reflect the world region corresponding to the school’s location. The data also had unwanted HTML tags and ASCII characters that were removed, and the character set encoding was converted to Unicode Transformation Format (UTF-8) to make the text human readable.
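
A minimal sketch of this kind of clean-up, using only the Python standard library, might look like the following. The helper names, the sample record, and the `REGION_BY_COUNTRY` lookup are assumptions made for illustration; the actual field names and region mapping are not listed in this post.

```python
import html
import re

# Hypothetical lookup; the post does not list the actual country-to-region mapping.
REGION_BY_COUNTRY = {"United States": "North America", "France": "Europe", "India": "Asia"}

def clean_field_name(name):
    """Turn a scraped column label into lowercase words separated by single spaces."""
    name = re.sub(r"[_\s]+", " ", name)   # underscores and runs of whitespace -> one space
    return name.strip().lower()

def clean_value(raw):
    """Drop HTML tags, decode entities such as &amp;, and normalise whitespace."""
    text = re.sub(r"<[^>]+>", "", raw)    # remove tags like <b> or <br/>
    text = html.unescape(text)            # &amp; -> &
    return re.sub(r"\s+", " ", text).strip()

# Hypothetical scraped record with leftover tags and an HTML entity.
record = {"School_Name": "<b>School A &amp; Co.</b>", "Country": " France<br/>"}
cleaned = {clean_field_name(k): clean_value(v) for k, v in record.items()}

# New variable: world region derived from the school's country.
cleaned["region"] = REGION_BY_COUNTRY.get(cleaned["country"], "Unknown")
```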

Exploratory Data Analysis

The box plot below shows a clear difference in the distribution of GMAT scores across the tiers. The average GMAT score for admission to Tier 1 schools was the highest, followed by the other tiers in the order shown in the plot. The difference in average GMAT scores is considerably larger between Tiers 1 and 2 and between Tiers 3 and 4 than between Tiers 2 and 3.

Plots by tier: average GMAT score; post-MBA salary

The distribution of average post-MBA compensation also declines gradually from Tier 1 to Tier 4 schools. Based on this plot, we can infer that a business school's tier tends to have a significant impact on post-graduate compensation. There are a few outliers in Tiers 3 and 4 (especially on the lower end) because Asian schools report relatively lower compensation. It also seems that those schools may not have factored in the purchasing power of their respective countries when reporting compensation in US dollars.


Distribution plots: average work experience; post-MBA salary

The distribution of average pre-MBA work experience has a long right tail (it is positively skewed), with a mean of 5.5 years. Based on further analysis, we can infer that students in Australia and Far East Asia tend to work longer before pursuing a full-time MBA than students in Europe, America and South Asia.

The average annual compensation is approximately normally distributed, with a mean annual salary of around USD 105,000 and no significant extreme values.

Hypothesis Tests

Hypothesis tests were run to check whether the differences in GMAT score, student diversity, faculty quality, and program rating were statistically significant. Using the Analysis of Variance (ANOVA) test, all of these variables except the MBA program rating were found to be statistically significant. This suggests that students' perception of the overall MBA program at their respective schools is not vastly different, and that the ratings across the world are broadly in line with each other.
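
To make the ANOVA step concrete, here is a small sketch of how a one-way ANOVA F statistic is computed from grouped observations. The function name and the three groups of GMAT scores are hypothetical, and the sketch stops at the F statistic: turning it into a p-value requires the F distribution, which a statistics package would normally provide.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F statistic, df_between, df_within) for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    group_means = [mean(g) for g in groups]
    grand_mean = mean([x for g in groups for x in g])

    # Between-group sum of squares: how far each group mean sits from the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Within-group sum of squares: spread of observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical average GMAT scores for schools in three groups (for example, regions).
samples = [[700, 710, 720], [650, 660, 670], [680, 690, 700]]
f_stat, df_b, df_w = one_way_anova_f(samples)
```

A large F statistic relative to the F distribution with `df_b` and `df_w` degrees of freedom indicates that the group means differ by more than chance alone would explain.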

Predictive Modeling

The next step was to create a Logistic Regression model using the variables below. Logistic Regression is a supervised Machine Learning algorithm that returns the probability of the response variable taking a particular value based on the combination of values taken by the predictors.

  • Average GMAT score
  • Post MBA salary
  • Percentage who received job offer
  • Percent graduates finding jobs through school’s career services
  • Student rating of program
  • Student rating of careers service
  • Ratio of Faculty to students
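
As a rough illustration of how a logistic regression turns predictor values into a probability, the sketch below applies the logistic function to a weighted sum of predictors. The intercept, coefficients, and feature values are hypothetical and are not the fitted values from this analysis.

```python
import math

def predict_probability(intercept, coefficients, features):
    """Logistic model: P(top tier) = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    linear = intercept + sum(b * x for b, x in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-linear))

# Hypothetical coefficients for two standardised predictors (not the fitted model).
p = predict_probability(intercept=-0.5, coefficients=[1.2, 0.8], features=[1.0, 0.5])

# Classify as top tier when the predicted probability reaches 0.5.
is_top_tier = p >= 0.5
```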

Model Building & Outcome

The response variable in this algorithm is typically a binary variable. Since the goal of this analysis is to determine whether a business school is a top tier school or not, Logistic Regression would be a good candidate to accomplish the binary classification task.

Logistic Regression Results


Based on a review of the logistic regression model, the variables that were statistically significant (at the 5% significance level) are:

  • Average GMAT score
  • Post MBA salary
  • Percentage who received job offer
  • Percent graduates finding jobs through school’s career services

The statistical significance of these variables indicates that they play a key role in determining whether a school is a top-tier school or not. The results of the Logistic Regression can be interpreted as the odds of a school being a top-tier school, which can be explained using the table below.
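
For readers less familiar with the odds interpretation: a logistic regression coefficient is a change in log-odds, so exponentiating it gives the multiplicative change in the odds for a one-unit increase in that predictor. The coefficient values below are hypothetical and are not the results of this model.

```python
import math

# Hypothetical fitted coefficients (log-odds per one-unit increase); not the post's results.
coefficients = {"avg_gmat": 0.05, "pct_job_offer": 0.10}

# Odds ratio: factor by which the odds of being top tier are multiplied per unit increase.
odds_ratios = {name: math.exp(beta) for name, beta in coefficients.items()}

# The same information expressed as a percentage change in the odds.
pct_change = {name: (ratio - 1.0) * 100.0 for name, ratio in odds_ratios.items()}
```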

Model Accuracy

The next step in the analysis was to check the accuracy of the model by looking at the confusion matrix. This matrix displays the results of predictions against the actual results from the observations for the training and test datasets in a tabular format that would help determine the percentage of accuracy in the predictions.
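
A confusion matrix of this kind can be tallied directly from the actual and predicted labels. The sketch below assumes labels coded as 1 for top tier and 0 otherwise; the label lists are hypothetical and are not the observations from this analysis.

```python
from collections import Counter

def confusion_matrix(actual, predicted):
    """Count (actual, predicted) pairs for a binary top-tier / not-top-tier label."""
    counts = Counter(zip(actual, predicted))
    return {
        "true_positive": counts[(1, 1)],
        "false_negative": counts[(1, 0)],
        "false_positive": counts[(0, 1)],
        "true_negative": counts[(0, 0)],
    }

# Hypothetical labels: 1 = top tier, 0 = not top tier.
actual = [1, 1, 0, 0, 1, 0, 0, 1]
predicted = [1, 0, 0, 0, 1, 1, 0, 1]
matrix = confusion_matrix(actual, predicted)
```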

True Vs Predicted Value

In the training data set, 61 of the 68 observations were predicted correctly and 7 were misclassified. In the test data set, 29 of the 32 observations were predicted correctly, resulting in an overall prediction accuracy of about 90%. Although the model predicts well on this data set, the overall McFadden's R squared was ~66.2%. This is a pseudo R squared that measures how much the model improves on an intercept-only model in log-likelihood terms, so it is only loosely comparable to the share of variability explained in a linear regression. There is still some scope for improving the model so that it generalizes well to unseen data sets.
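
The accuracy figures follow directly from the counts reported above, as the short calculation below shows. The `mcfadden_r2` helper states the usual definition of McFadden's pseudo R squared; the log-likelihood values passed to it are hypothetical, since the fitted model's log-likelihoods are not reported in this post.

```python
# Counts reported in the post.
train_correct, train_total = 61, 68
test_correct, test_total = 29, 32

train_accuracy = train_correct / train_total
test_accuracy = test_correct / test_total
overall_accuracy = (train_correct + test_correct) / (train_total + test_total)

def mcfadden_r2(loglik_model, loglik_null):
    """McFadden's pseudo R squared: 1 - LL(fitted model) / LL(intercept-only model)."""
    return 1.0 - loglik_model / loglik_null

# Hypothetical log-likelihoods chosen only to illustrate the formula.
example_r2 = mcfadden_r2(loglik_model=-20.0, loglik_null=-50.0)
```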

The R code for the complete analysis as well as the Python code for web scraping can be reviewed at the GitHub repository:


The following insights were derived based on the predictive model and the exploratory data analysis performed:

  • The Rank or Tier of a Business school does seem to have an impact on the individual's annual compensation and ultimately the return on investment post-graduation.
  • GMAT score, annual compensation (upon graduation), percentage of students who received a job offer, and effectiveness of the school’s career services are the four predictors that are statistically significant (at the 5% significance level) in determining whether a school is a top-tier business school.
  • The average GMAT score of the cohort is lower in Europe compared to the U.S. Also, Asia Pacific and Australian business schools tend to have lower geographical diversity compared to their counterparts in Europe and the U.S.
  • There is no statistically significant difference in the students’ rating (perception) of the full-time MBA program across the various regions of the world, but there is a statistically significant difference in students’ rating of the faculty across the business schools.
  • Outside of the U.S., students tend to have more work experience before pursuing a full-time MBA than students at U.S.-based business schools.
  • Based on the principal recruiters’ information, consulting seems to be the industry of choice for the Full time MBA graduates followed by technology and financial services.

About Author

Nanda Rajarathinam

Nanda has been applying his analytical, problem-solving and team management skills, at a leading consulting firm, focusing on data engineering, solution architecture and analytics. He has a strong background in Build, implementation and performance optimization of Extract, Transform...