MLS Salaries 2007-2019

Jay Cohen
Posted on May 17, 2020

Introduction

Every season since 2007, the Major League Soccer Players Association ("MLSPA") has published a report containing the salaries of every player in the League. Those reports are available on the MLSPA's website.

I created a Shiny App to allow users to explore the data in those reports.

Data Tab

The Data Tab contains a table with all of the data from the MLSPA:

The data can be filtered by club, season, and position. 
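As a rough illustration of this kind of filtering, here is a minimal Python sketch. It is not the App's code (the App is built with Shiny); the record layout, field names, and sample rows are all made up for the example.

```python
from typing import NamedTuple


class PlayerSalary(NamedTuple):
    # Hypothetical record layout; these field names are assumptions.
    player: str
    club: str
    season: int
    position: str
    guaranteed_comp: float


def filter_salaries(rows, club=None, season=None, position=None):
    """Keep rows matching every filter that is set; None means 'no filter'."""
    return [
        r for r in rows
        if (club is None or r.club == club)
        and (season is None or r.season == season)
        and (position is None or r.position == position)
    ]


# Made-up rows for illustration only.
rows = [
    PlayerSalary("Player A", "Club X", 2018, "M", 150_000.0),
    PlayerSalary("Player B", "Club X", 2019, "F", 600_000.0),
    PlayerSalary("Player C", "Club Y", 2019, "D", 90_000.0),
]

club_x_2019 = filter_salaries(rows, club="Club X", season=2019)
```

Leaving a filter unset returns every value for that field, which mirrors a table that starts out unfiltered.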

By Season Tab

The By Season Tab examines player salaries across seasons. 

The By Season Tab includes two sub-tabs: 2007-2019 Boxplots and Season vs Season Histograms.

At the top of the 2007-2019 Boxplots Sub-Tab is a boxplot that shows each season's distribution of player salaries. That is, for each season, the chart shows the distribution of the Annualized Average Guaranteed Compensation of every player on every club during that season.
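For readers who want the numbers behind a boxplot, the sketch below computes a five-number summary per season in Python. It is an illustration under assumptions: the sample salaries are made up, and the "inclusive" quartile method is my choice, which may differ slightly from the method the App's plotting library uses.

```python
from collections import defaultdict
from statistics import quantiles


def season_box_stats(records):
    """records: iterable of (season, salary) pairs.

    Returns {season: {"min", "q1", "median", "q3", "max"}}.
    """
    by_season = defaultdict(list)
    for season, salary in records:
        by_season[season].append(salary)

    stats = {}
    for season, salaries in sorted(by_season.items()):
        # n=4 yields the three quartile cut points.
        q1, median, q3 = quantiles(salaries, n=4, method="inclusive")
        stats[season] = {
            "min": min(salaries),
            "q1": q1,
            "median": median,
            "q3": q3,
            "max": max(salaries),
        }
    return stats


# Made-up salaries for a single season, including one very large value.
records = [
    (2007, 30_000),
    (2007, 50_000),
    (2007, 70_000),
    (2007, 90_000),
    (2007, 2_000_000),
]
box = season_box_stats(records)
```

A single very large salary stretches the maximum far above the box, which is why a zoomed-in view of the lower region is useful.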

Below that boxplot is a second chart that zooms in on the lower region of the boxplot above it:

Not surprisingly, salaries at all levels have increased since 2007.

These charts also demonstrate the impact of collective bargaining on salaries. The League and the MLSPA reached a new collective bargaining agreement ("CBA") in 2015 that ran through the 2019 season. Focusing on those five seasons, while the median salary has increased since 2014, most of the salary gains have come in the quartile between the 50th and 75th percentiles. In 2014, a player at the 75th percentile made approximately $180,000 per season. By 2019, a player at the 75th percentile made approximately $500,000, more than two-and-one-half times the 2014 figure.

The 2015 CBA also introduced Targeted Allocation Money ("TAM"). Because TAM may only be used on players earning between approximately $450,000 and $1,500,000, clubs are incentivized to retain players in that salary band. Accordingly, since 2014, the number of players earning between $450,000 and $1,500,000 has risen from 17 to 161. (Note that the methodology the League uses to determine a player's salary for purposes of TAM eligibility is slightly different from the methodology that the MLSPA uses to calculate a player's Annualized Average Guaranteed Compensation.)
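The band count and the 75th-percentile ratio above are simple to reproduce. The Python sketch below shows one way, assuming made-up sample records and treating both band endpoints as inclusive (an assumption; the text gives only approximate bounds).

```python
from collections import Counter

# Approximate bounds quoted in the text; inclusive endpoints are an assumption.
BAND_LOW = 450_000
BAND_HIGH = 1_500_000


def count_in_band(records, low=BAND_LOW, high=BAND_HIGH):
    """records: iterable of (season, salary). Returns {season: players in band}."""
    counts = Counter()
    for season, salary in records:
        if low <= salary <= high:
            counts[season] += 1
    return dict(counts)


# Made-up records for illustration only.
records = [
    (2014, 100_000),
    (2014, 500_000),
    (2019, 80_000),
    (2019, 460_000),
    (2019, 1_200_000),
    (2019, 2_000_000),
]
band_counts = count_in_band(records)

# Ratio of the approximate 75th-percentile salaries cited in the text.
ratio_2019_to_2014 = 500_000 / 180_000
```

With the approximate figures from the text, the ratio comes out a little under 2.8, consistent with "more than two-and-one-half times."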

The Season vs Season Histograms Sub-Tab includes histograms of two seasons of the user's choice. Salaries above $1,000,000 are grouped together in a final bin.
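Grouping everything above a cap into one final bin can be sketched as follows. This is an illustrative Python version, not the App's code; the $100,000 bin width is an assumption, since the text specifies only the $1,000,000 cap.

```python
def histogram_with_overflow(salaries, bin_width=100_000, cap=1_000_000):
    """Count salaries in fixed-width bins up to `cap`, plus one overflow bin.

    The returned list has cap // bin_width regular bins followed by a final
    bin holding every salary strictly above `cap`.
    """
    n_regular = cap // bin_width
    counts = [0] * (n_regular + 1)
    for salary in salaries:
        if salary > cap:
            counts[-1] += 1  # overflow bin
        else:
            # A salary exactly equal to the cap goes in the last regular bin.
            idx = min(int(salary // bin_width), n_regular - 1)
            counts[idx] += 1
    return counts


# Made-up salaries for illustration only.
sample = [50_000, 99_999, 100_000, 1_000_000, 1_000_001, 3_000_000]
counts = histogram_with_overflow(sample)
```

The overflow bin keeps a handful of very large salaries from stretching the x-axis and flattening the rest of the histogram.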

By Position Tab

The By Position Tab compares player salaries across positions. The user can filter the data by club and season. 
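A comparison across positions depends on which summary statistic is used. The Python sketch below, using made-up salaries and hypothetical position codes, shows how a single very large salary can lift a position's mean well above its median.

```python
from collections import defaultdict
from statistics import mean, median


def summarize_by_position(records):
    """records: iterable of (position, salary). Returns mean and median per position."""
    by_position = defaultdict(list)
    for position, salary in records:
        by_position[position].append(salary)
    return {
        position: {"mean": mean(values), "median": median(values)}
        for position, values in by_position.items()
    }


# Made-up salaries; "M" and "F" are hypothetical codes for midfielder and forward.
records = [
    ("M", 50_000),
    ("M", 60_000),
    ("M", 5_000_000),  # one outsized midfielder salary
    ("F", 100_000),
    ("F", 120_000),
    ("F", 140_000),
]
summary = summarize_by_position(records)
```

In this toy data the midfielders have the higher mean but the lower median, so the ranking of positions flips depending on the statistic chosen.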

Surprisingly, between 2007 and 2009, clubs were actually paying more for midfielders than they were for strikers:

Those figures are, however, influenced by David Beckham's $6,500,000 Annualized Average Guaranteed Compensation (more than double that of anyone else in the League during that period). By 2019, clubs had settled into the pattern of paying strikers more than midfielders, midfielders more than defenders, and defenders more than goalkeepers:

By Club Tab

The By Club Tab allows the user to explore how different clubs compose their squads. The user selects two club-season combinations (e.g., the New York Red Bulls in 2019 and the New York Red Bulls in 2018). Below the user's selections are boxplots showing the distribution of player salaries for each selected club and season:

Then, below those boxplots, are zoomed-in images of the lower regions of the boxplots on top, just as on the By Season Tab:

Points By Club Spend Tab

The last tab in the App examines whether there is a relationship between the amount of money a club spends on player salaries in a season and the number of points that club earned during that season. The scatterplot at the top plots the points per match a club earned during a season against the amount the club spent on player salaries during that season:

Because salaries increased from 2007 to 2019, I applied a logarithmic transformation to each club's season salary total before performing a linear regression. I then replotted the transformed data and fit the regression:

Somewhat surprisingly, there was very little correlation between how much money a club spent on player salaries and the points that club gained during the season. 
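The transformation and regression described above can be sketched in a few lines of Python. This is an illustration only: the sample club seasons are made up, the natural logarithm is an assumption (the base is not stated above), and `fit_line` is a hypothetical helper implementing ordinary least squares.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r), where r is the Pearson correlation.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


# Made-up club seasons: (total salary spend, points, matches played).
club_seasons = [
    (3_000_000, 45, 30),
    (5_000_000, 40, 34),
    (12_000_000, 55, 34),
    (20_000_000, 48, 34),
]

log_spend = [math.log(spend) for spend, _, _ in club_seasons]
points_per_match = [points / matches for _, points, matches in club_seasons]

slope, intercept, r = fit_line(log_spend, points_per_match)
r_squared = r ** 2  # share of variance in points per match explained by log spend
```

An `r_squared` close to zero would correspond to the weak relationship described above; the value for this toy data carries no meaning.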

______________________________________________________________________________

Hopefully this App provides a means for you to explore the available salary data and draw some additional conclusions of your own.

About Author

Jay Cohen

A Harvard College and Harvard Law School graduate, Jay has several years of experience working in law, including representing sports leagues and teams. He now hopes to combine his legal skills with data science skills to assist sports...