Exploring Spotify's Global Artist Markets

Posted on Jun 14, 2021

Introduction

Spotify is one of the largest streaming and media services in the world. A better understanding of the Spotify markets would be highly beneficial to artists, producers, and labels alike. Of particular importance is the popularity score that Spotify calculates for each artist, as it provides a standardized metric by which artists can compare themselves to the competition.

Previously, we explored artists in the US market to better understand the relationship between artists' popularity and follower count. We saw an expected correlation: as the popularity of an artist increased, so did their number of followers. However, this trend is only visible at the macro scale of the entire US market. When looking at follower counts within a popularity range, we see numerous artists that have more followers than artists with higher popularity scores. This held for scores up to a popularity of 90, above which the number of artists at each popularity score drops considerably. From this, we gathered that a high follower count does not guarantee a high popularity. In other words, to increase their popularity, artists need to focus on getting their tracks played rather than on acquiring followers.

With this in mind, we move on to determining whether this trend exists in foreign markets as well. It is likely that this trend is present in all markets for two reasons. First, the scoring system used by Spotify is the same across all markets. Second, highly popular artists likely have an international presence. Confirming this trend will help us set a baseline for future work. We will also briefly explore the genres of US artists, to see if there is anything that may help us understand how a US artist can grow their popularity.

Data Preparation

We again used the US Spotify Tracks dataset available on Kaggle. In addition, we used another dataset, also available on Kaggle, with artist market data for 125 foreign markets. The genre data is part of the previously mentioned US market dataset.

We will manipulate and clean the artist market data in Python, using the pandas and NumPy libraries. The cleaned data will be exported to .csv files, which we will import into R. We can then manipulate these datasets in R to generate the necessary visualizations and statistics.
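As a rough illustration of the cleaning step, here is a minimal pandas sketch. The column names (`artist_id`, `name`, `market`, `popularity`, `followers`) and the sample rows are assumptions made for this example and are not taken from the Kaggle files, whose actual layout may differ. The only external fact relied on is that Spotify reports popularity on a 0 to 100 scale.

```python
import io

import pandas as pd

# Hypothetical raw sample; the real Kaggle files have their own columns.
RAW_CSV = """artist_id,name,market,popularity,followers
a1,Artist One,US,85,1200000
a1,Artist One,US,85,1200000
a2,Artist Two,JP,40,
a3,Artist Three,JP,not_available,5000
a4,Artist Four,US,62,300000
"""


def clean_artist_markets(raw: pd.DataFrame) -> pd.DataFrame:
    """Drop duplicate rows and rows without usable numeric values."""
    df = raw.drop_duplicates().copy()
    # Coerce to numbers; entries that cannot be parsed become NaN.
    for col in ("popularity", "followers"):
        df[col] = pd.to_numeric(df[col], errors="coerce")
    df = df.dropna(subset=["popularity", "followers"])
    # Keep only scores inside Spotify's 0-100 popularity range.
    df = df[df["popularity"].between(0, 100)]
    return df.astype({"popularity": int, "followers": int}).reset_index(drop=True)


raw = pd.read_csv(io.StringIO(RAW_CSV))
cleaned = clean_artist_markets(raw)

# With no path, to_csv returns the CSV text; pass a file path to write a file for R.
csv_text = cleaned.to_csv(index=False)
print(csv_text)
```

In this toy sample the duplicate row, the row with a missing follower count, and the row with a non-numeric popularity are all dropped, leaving two artists. The resulting file could then be loaded in R with `read.csv`.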

Visualizing the Analysis

We can take a side-by-side look at the US and Japan markets to see if trends are similar:

As expected, the Japan market shows the trend seen in the US market. We will adjust the bins to see how this looks at each popularity above 80:

In general, an increase in popularity means an artist will have more followers. However, we still see artists that have more followers than other artists with greater popularity.
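The comparison described above can be sketched in pandas as follows. This is an illustrative example only: the artists and follower counts are made up, the bin width of 5 is an arbitrary choice, and the helper `beats_a_more_popular_artist` is a hypothetical name introduced here. The original analysis produced its plots in R.

```python
import pandas as pd

# Made-up artists for illustration only; not taken from the Kaggle data.
artists = pd.DataFrame({
    "name": ["A", "B", "C", "D", "E", "F"],
    "popularity": [82, 84, 86, 88, 91, 93],
    "followers": [900_000, 2_500_000, 1_500_000, 2_000_000, 3_000_000, 10_000_000],
})

# Half-open bins of width 5; the last edge is 101 so a score of 100 is included.
edges = [80, 85, 90, 95, 101]
labels = ["80-84", "85-89", "90-94", "95-100"]
artists["bin"] = pd.cut(artists["popularity"], bins=edges, right=False, labels=labels)

# Follower summary per popularity bin (bins with no artists are skipped).
summary = artists.groupby("bin", observed=True)["followers"].agg(["min", "median", "max"])


def beats_a_more_popular_artist(df: pd.DataFrame) -> pd.Series:
    """Flag artists with more followers than at least one strictly more popular artist."""
    flags = []
    for _, row in df.iterrows():
        more_popular = df[df["popularity"] > row["popularity"]]
        flags.append(bool((more_popular["followers"] < row["followers"]).any()))
    return pd.Series(flags, index=df.index)


artists["exception"] = beats_a_more_popular_artist(artists)
print(summary)
print(artists.loc[artists["exception"], ["name", "popularity", "followers"]])
```

In this toy data the median follower count rises from bin to bin, yet artist B (popularity 84) has more followers than artists C and D, who sit in a higher popularity bin. That is the same pattern the plots show at a larger scale.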

We will now explore whether genres overall have any relation to popularity. We get the following distribution for number of genres at each popularity:

The genres follow a roughly normal distribution, centered around a popularity of 45. A significant number of genres appear at zero popularity, which makes sense considering that many unique genres are associated only with artists of zero popularity. There are also few genres associated with artists of very high popularity. This is expected, since the number of artists decreases significantly at higher popularity scores, and these highly popular artists will likely not have many unique genres associated with them.
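One way to obtain a count of genres at each popularity is sketched below. It assumes that each artist row carries a list of genre labels in a `genres` column and interprets "number of genres" as the number of distinct genre labels among artists at that popularity score. Both the column layout and this interpretation are assumptions for illustration, and the sample rows are made up.

```python
import pandas as pd

# Made-up artists; the `genres` column is assumed to hold a list of labels.
artists = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "popularity": [0, 45, 45, 90],
    "genres": [["obscure folk"], ["pop", "dance pop"], ["pop", "indie rock"], ["pop"]],
})

genres_per_popularity = (
    artists.explode("genres")      # one row per (artist, genre) pair
    .dropna(subset=["genres"])     # artists with an empty genre list yield NaN
    .groupby("popularity")["genres"]
    .nunique()                     # distinct genre labels at each popularity
)
print(genres_per_popularity)
```

Plotting this series against popularity would give a distribution of the kind described above. If the real dataset stores genres as a string rather than a list, that column would need to be parsed first.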

 

Conclusion

Although we did not find anything novel in this analysis, we can confirm that the trend seen in the US market is also present in foreign markets. This finding helps to simplify our future work: we can reasonably expect that trends we find within the US market will also be present in foreign markets.

Looking at US market genres as a whole, we could not obtain enough information to concretely establish a relationship between an artist's genre and popularity.

Future Work

With this baseline established, our next analyses can delve deeper into other relationships that might impact an artist's Spotify popularity score. For instance, we can take a closer look at whether the number of genres an artist has correlates with their popularity. Another angle would be to see whether any particular genre is more popular than others, though this will require us to categorize all genres first. In addition, we can explore possible trends between a song's composition and an artist's popularity.

About Author

Aleksey Klimchenko

BS in Bioinformatics & Molecular Biology RPI '17