Data Visualization of Spotify's Progress Over Time
The skills I demoed here can be learned through taking Data Science with Machine Learning bootcamp with NYC Data Science Academy.
The medium through which we listen to music has changed from cassette tapes to CDs to MP3 players and now to streaming services. To explore how music trends have changed over the past 10 years, it was therefore only logical to look at the popular songs on Spotify, the leading music streaming service. If trends exist among the features of Spotify's popular songs, they might help predict future music trends and make the entertainment business more successful.
First, I wanted to explore the most popular features of the past 10 years overall. According to the data, Memories by Maroon 5 turned out to be the most popular song of the past 10 years. Katy Perry is the most popular artist, and dance pop is the most popular genre.
Beyond that, using the various song characteristics that Spotify provides, I was able to check the most common values of the other features. For example, a BPM (beats per minute) of about 124 was the most common among the songs, and the most common energy value (the higher the value, the more energetic the song) was about 81.
Likewise, the values for danceability (the higher the value, the easier it is to dance to the song), loudness (the higher the value, the louder the song), liveness (the higher the value, the more likely the song is a live recording), valence (the higher the value, the more positive the mood of the song), duration (the length of the song), acousticness (the higher the value, the more acoustic the song is), and speechiness (the higher the value, the more spoken word the song contains) were 69, -5, 10, 45, 220, 3.4, and 5, respectively. This data revealed the most popular features, but how did the features change over the span of 10 years?
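As a rough illustration of how the most common value of each column can be found, here is a minimal sketch in plain Python. The rows, column names, and the most_common helper are made up for this example and are not the real Spotify data or the code used for the charts in this post.

```python
from collections import Counter

# Made-up rows for illustration only; not the real Spotify data.
songs = [
    {"title": "A", "artist": "X", "genre": "dance pop", "bpm": 124, "energy": 81},
    {"title": "B", "artist": "X", "genre": "dance pop", "bpm": 124, "energy": 75},
    {"title": "C", "artist": "Y", "genre": "pop", "bpm": 100, "energy": 81},
    {"title": "D", "artist": "Z", "genre": "dance pop", "bpm": 95, "energy": 60},
]

def most_common(rows, column):
    """Return the most frequent value in a column (hypothetical helper)."""
    counts = Counter(row[column] for row in rows)
    return counts.most_common(1)[0][0]

# Most frequent value for each column of interest.
summary = {col: most_common(songs, col) for col in ("artist", "genre", "bpm", "energy")}
print(summary)
```

The same idea applies to any of the other feature columns. With a real dataset, near-identical values might first need to be rounded or binned so that "about 124" counts as one value.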
Average of Each Feature Every Year
This question led me to look at the average of each feature for every year.
BPM rose until 2014 and then started decreasing, while energy kept decreasing. Danceability increased significantly starting in 2013, and loudness has fluctuated since 2014. Liveness has decreased since 2016, valence has fluctuated since 2015, and duration has tended to decrease since 2015, suggesting that people prefer shorter songs. Acousticness increased significantly in 2018, while speechiness fluctuated but seems to have been in general decline since 2017.
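The yearly averages can be computed by grouping the songs by year and taking the mean of each feature. Below is a minimal sketch using only the Python standard library. The rows, the column names "bpm" and "dur", and the yearly_average helper are assumptions for illustration, not the actual dataset or the code behind the charts.

```python
from collections import defaultdict
from statistics import mean

# Made-up rows for illustration only; not the real Spotify data.
rows = [
    {"year": 2010, "bpm": 120, "dur": 230},
    {"year": 2010, "bpm": 130, "dur": 250},
    {"year": 2011, "bpm": 110, "dur": 210},
    {"year": 2011, "bpm": 100, "dur": 200},
]

def yearly_average(rows, feature):
    """Group rows by year and average one feature (hypothetical helper)."""
    by_year = defaultdict(list)
    for row in rows:
        by_year[row["year"]].append(row[feature])
    # Sort by year so the result reads as a timeline.
    return {year: mean(values) for year, values in sorted(by_year.items())}

bpm_by_year = yearly_average(rows, "bpm")
dur_by_year = yearly_average(rows, "dur")
print(bpm_by_year)
print(dur_by_year)
```

Plotting each of these dictionaries as a line chart, one per feature, would give the kind of year-over-year view described above.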
After exploring the change in trends annually, I also started wondering: how much does each feature contribute to the popularity of the song?
I created scatterplots and density maps of each feature’s value against the popularity of the song to see where the songs cluster, and I estimated the median of the densest area. Surprisingly, for every feature the densest area sits at popularity values in the 70s, which indicates that songs with popularity values around 70 have more mainstream characteristics than the ones at either extreme of the popularity spectrum.
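One simple way to approximate "the median of the densest area" is to bin the points into a grid, pick the fullest cell, and take the median of the points inside it. The sketch below shows that idea with made-up (feature value, popularity) pairs. The densest_cell helper and the bin width of 10 are assumptions for illustration, not necessarily how the density maps in this post were read.

```python
from collections import Counter
from statistics import median

# Made-up (feature value, popularity) pairs; not the real Spotify data.
points = [(60, 72), (65, 74), (68, 71), (62, 78), (30, 20), (90, 95), (66, 75)]

def densest_cell(points, bin_width=10):
    """Find the fullest grid cell and the medians of the points in it."""
    # Assign each point to a grid cell by integer division.
    cells = Counter((x // bin_width, y // bin_width) for x, y in points)
    cell, count = cells.most_common(1)[0]
    members = [(x, y) for x, y in points
               if (x // bin_width, y // bin_width) == cell]
    feature_median = median(x for x, _ in members)
    popularity_median = median(y for _, y in members)
    return feature_median, popularity_median, count

feature_median, popularity_median, count = densest_cell(points)
print(feature_median, popularity_median, count)
```

The result depends on the chosen bin width, so it is only an approximation of what a smoothed density map shows.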
But is there any correlation between the features? Across the full 10 years, there seems to be no strong correlation between year and any feature, or between popularity and any feature. However, among the features themselves, three pairs seem to be correlated: energy and loudness, danceability and valence, and energy and acousticness.
With the analysis above, I was able to explore music trends over the past 10 years. It also made me want to go deeper by comparing this dataset with 10 years of Billboard’s charts, or with a list of award-winning songs, to see whether any other trends exist between them.
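A common way to quantify this kind of relationship is the Pearson correlation coefficient, which ranges from -1 (perfect negative linear relationship) to 1 (perfect positive linear relationship). The sketch below computes it by hand for made-up feature values. The numbers are arbitrary and the pearson helper is written for this example, so neither reflects the actual dataset or the tool used for the analysis.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Arbitrary toy values for illustration only; not the real Spotify data.
energy = [40, 55, 70, 85]
loudness = [-9, -7, -5, -3]
acousticness = [60, 40, 25, 5]

print(round(pearson(energy, loudness), 3))
print(round(pearson(energy, acousticness), 3))
```

Running the same function over every pair of feature columns gives a correlation matrix, which is usually drawn as a heatmap.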