10 Years of Spotify

Jay Kim
Posted on Aug 17, 2020

The medium through which we listen to music has shifted from cassette tapes to CDs to MP3 players, and now to streaming services. To explore how music trends have changed over the past 10 years, it made sense to look at the popular songs on Spotify, the leading music streaming service. If trends exist among the features of popular songs on Spotify, they might help the entertainment business predict future music trends and be more successful.


First, I wanted to find out what was most popular over the past 10 years overall. According to the data, Memories by Maroon 5 turned out to be the most popular song of the past 10 years. Katy Perry was the most popular artist, and dance pop was the most popular genre.

Beyond that, the song characteristics that Spotify provides let me check which feature values were most common. A BPM (beats per minute) of about 124 was the most common among the songs, and the most common energy (the higher the value, the more energetic the song) was about 81. The most common values of the other features were 69 for danceability (the higher the value, the easier it is to dance to the song), -5 for loudness (the higher the value, the louder the song), 10 for liveness (the higher the value, the more likely the song is a live recording), 45 for valence (the higher the value, the more positive the mood of the song), 220 for duration (the length of the song), 3.4 for acousticness (the higher the value, the more acoustic the song is), and 5 for speechiness (the higher the value, the more spoken word the song contains). This revealed the most popular feature values, but how did the features change over the span of 10 years?
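
As a rough illustration of how a "most common value" like the ones above can be found, here is a minimal sketch using only the Python standard library. The row layout, the field names ("title", "bpm", "energy"), and the toy values are assumptions made up for this example; they are not the actual Spotify data or necessarily the method used for this post.

```python
from collections import Counter


def most_common_value(rows, column):
    """Return the most frequent value of `column` across `rows` (a list of dicts)."""
    counts = Counter(row[column] for row in rows)
    value, _count = counts.most_common(1)[0]
    return value


# Toy rows for illustration only (hypothetical field names and values).
toy_songs = [
    {"title": "A", "bpm": 100, "energy": 60},
    {"title": "B", "bpm": 110, "energy": 70},
    {"title": "C", "bpm": 110, "energy": 70},
]

print(most_common_value(toy_songs, "bpm"))
```

The same function works for any column, including text columns such as an artist or genre field, since it only counts how often each value appears.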


To find out, I looked at the average of each feature for every year. BPM rose until 2014 and then started decreasing, while energy has tended to keep decreasing. Danceability started increasing significantly in 2013, and loudness has been fluctuating since 2014. Liveness has been decreasing since 2016, valence has been fluctuating since 2015, and duration has tended to decrease since 2015, suggesting that people prefer shorter songs. Acousticness increased significantly in 2018, while speechiness fluctuated but seems to have been in general decline since 2017. After exploring the annual trends, I started wondering: how much does each feature contribute to the popularity of a song?
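
The yearly averages described above amount to grouping the songs by year and taking the mean of a feature within each group. A minimal sketch of that step, again with the standard library only; the "year" and "bpm" field names and the toy rows are assumptions for illustration, not the real dataset.

```python
from collections import defaultdict
from statistics import mean


def yearly_average(rows, column):
    """Group rows by their "year" field and average `column` within each year."""
    by_year = defaultdict(list)
    for row in rows:
        by_year[row["year"]].append(row[column])
    # Sort by year so the result reads as a timeline.
    return {year: mean(values) for year, values in sorted(by_year.items())}


# Toy rows for illustration only (hypothetical field names and values).
toy_songs = [
    {"year": 2011, "bpm": 90},
    {"year": 2010, "bpm": 100},
    {"year": 2010, "bpm": 120},
]

for year, avg in yearly_average(toy_songs, "bpm").items():
    print(year, avg)
```

Plotting the returned values against the years, one line per feature, would give the kind of trend view discussed in the paragraph above.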

I created scatterplots and density maps of each feature’s value against the popularity of the song to see where the songs concentrate, and estimated the median of the densest area. Surprisingly, for every feature the densest area falls where popularity is in the 70s, which suggests that songs with popularity values around 70 have more mainstream characteristics than songs at either extreme of the popularity spectrum.
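
One simple way to approximate the "densest area" of a scatterplot numerically is to bin the points into a grid and pick the fullest cell. The sketch below does that; it is a swapped-in simplification rather than the density maps used for this post, and the field names ("energy", "popularity"), the bin size, and the toy rows are all assumptions.

```python
from collections import Counter


def densest_cell(rows, feature, bin_size=10):
    """Bin (feature, popularity) pairs into a square grid.

    Returns (feature_bin_start, popularity_bin_start, count) for the cell
    that contains the most songs.
    """
    cells = Counter(
        (
            row[feature] // bin_size * bin_size,
            row["popularity"] // bin_size * bin_size,
        )
        for row in rows
    )
    (feature_bin, popularity_bin), count = cells.most_common(1)[0]
    return feature_bin, popularity_bin, count


# Toy rows for illustration only (hypothetical field names and values).
toy_songs = [
    {"energy": 81, "popularity": 72},
    {"energy": 85, "popularity": 78},
    {"energy": 40, "popularity": 30},
]

print(densest_cell(toy_songs, "energy"))
```

A smaller `bin_size` gives a finer picture but needs more songs per cell before the fullest cell is meaningful.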

But is there any correlation between the features? Across all 10 years together, there seems to be no strong correlation between year and the features, or between popularity and the features. However, among the features themselves, three pairs seem to be correlated: energy and loudness, danceability and valence, and energy and acousticness.
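
To make the idea of correlation between two features concrete, here is a minimal sketch of the Pearson correlation coefficient. The post does not say which correlation measure was used, so Pearson is an assumption here, and the toy values are made up for illustration and say nothing about the real data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences.

    Ranges from -1 (perfect negative linear relationship) to 1 (perfect
    positive linear relationship). Undefined if either sequence is constant.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Toy feature columns for illustration only (hypothetical values).
toy_energy = [50, 60, 70, 80]
toy_loudness = [-8, -7, -6, -5]

print(pearson(toy_energy, toy_loudness))
```

Values near 0 mean no linear relationship, which is what "no strong correlation" between popularity and the features would look like.
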

With the analysis above, I was able to explore the trends in music over the past 10 years. It also made me want to go deeper by comparing this data with the Billboard charts from the same 10 years, or with a list of award-winning songs, to see whether any other trends exist between them.

About Author

Jay Kim

BA in Psychology at NYU & Assistant Accountant