Webscraping Every Platinum Record: What Happened to the Album?

Posted on Oct 30, 2017


Over the past 20 years, the music industry has changed drastically, particularly in the way we consume music. Streaming services like Spotify and Apple Music are a wonderfully intimate way to listen to music. In the digital age, we, the users, have full autonomy over what we listen to. We can press skip, we can press pause, and we can compile songs from albums and singles into playlists. How has this changed consumer behavior, though? Given the digital platforms that exist now, the market has become extremely saturated with artists. Anyone can learn an instrument, record a song on their phone, and upload it to Spotify. So with such an abundance of music in the world, how does one go about garnering attention and standing out? In short, the current trend in the industry suggests that up-and-coming artists should stop making albums and release their music as singles.

To study these trends, I chose to look at the most successful albums and singles of all time, as certified by the Recording Industry Association of America (RIAA). The RIAA is the organization that certifies records as gold, platinum, multi-platinum, or diamond. A certification is awarded when an artist or label applies for it and provides proof of sales. The RIAA defines gold as the sale of 500,000 units, platinum as 1 million units, multi-platinum as a multiple of a million units (2 million and up), and diamond as 10 million units. The specifics are defined on the RIAA website, but just know that a unit is the sale of one physical or digital copy of an album or single. To account for music streaming services, in 2016 the RIAA also began counting 150 streams of a single, or 1,500 streams of an album, as one unit.
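To make the thresholds concrete, here is a minimal Python sketch of how units and certification levels fit together. The function and constant names are my own illustration, not anything the RIAA publishes, and the sketch assumes streams convert to whole units by simple integer division.

```python
from typing import Optional

# Thresholds as stated in this post; all names here are illustrative.
GOLD = 500_000
PLATINUM = 1_000_000
DIAMOND = 10_000_000

# Number of streams counted as one unit, per the 2016 rule described above.
STREAMS_PER_UNIT = {"single": 150, "album": 1500}


def total_units(sales: int, streams: int, fmt: str) -> int:
    """Combine copies sold with stream-equivalent units for a single or album."""
    # Assumption: partial units from leftover streams are dropped.
    return sales + streams // STREAMS_PER_UNIT[fmt]


def certification(units: int) -> Optional[str]:
    """Return the highest certification level reached, or None below gold."""
    if units >= DIAMOND:
        return "diamond"
    if units >= 2 * PLATINUM:
        return f"{units // PLATINUM}x multi-platinum"
    if units >= PLATINUM:
        return "platinum"
    if units >= GOLD:
        return "gold"
    return None
```

For example, a single with 400,000 copies sold and 15 million streams would come to 500,000 units under these assumptions, which is exactly the gold threshold.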

Click here if you want to learn more about how the RIAA defines a unit and how its certification process works.

The Data

The data was obtained by scraping the RIAA’s website, which lists all awards given out since its inception and can be found at the following link: https://www.riaa.com/gold-platinum/. Each entry reflects the most recent award that a particular album or single received. For each awarded album or single, we have the artist, the title, the date it was most recently awarded, the name of the record label, the format (single or album), the release date, the type (digital or physical copy), the group type (band, solo, or duo), and the genre.
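One way to picture a scraped entry is as a small record type holding the fields listed above. This is only a sketch: the class name, field names, and the example values are placeholders I made up for illustration, not the scraper's actual schema or a real RIAA entry.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Award:
    """One scraped entry: the most recent award for an album or single."""
    artist: str
    title: str
    certified_on: date  # date of the most recent award
    label: str          # record label
    fmt: str            # "single" or "album"
    released_on: date
    media_type: str     # "digital" or "physical"
    group_type: str     # "band", "solo", or "duo"
    genre: str


# Placeholder values only, not a real RIAA entry.
example = Award(
    artist="Example Artist",
    title="Example Title",
    certified_on=date(2017, 1, 1),
    label="Example Records",
    fmt="album",
    released_on=date(2016, 1, 1),
    media_type="digital",
    group_type="solo",
    genre="pop",
)
```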

Top Charters

To start, let's take a look at the top-selling artists in the domain of albums, as seen below in Figure 1.

Figure 1. Top 30 albums that have achieved at least platinum status (sold 1 million album units).

Now, compare this to the top-selling artists in the domain of singles, as seen below in Figure 2. Do you notice anything different about the names of the artists in Figure 1 vs. Figure 2?

Figure 2. The top 30 artists by total count of singles that have achieved at least platinum status (sold 1 million units).

The difference between the two should be fairly apparent. The top-selling album artists are all from the pre-Internet era, while many of the top-selling singles artists are current and are still making commercially successful music today. Hmm... The trend here seems to say that singles are more popular nowadays, but what does this really mean? Well, to answer this, let's look at the effect on album versus singles sales in the context of what I consider the two huge digital disruptors in music: Napster and Spotify.

The Singles Trend

If we now split the distribution of albums and singles into the eras before and after Napster and Spotify, we start to get a better picture of how much these two disruptors really changed the music industry. Furthermore, we can see that singles now far outweigh albums in commercial success, as depicted in the time series graph below, where the red line is the percentage of all awards that went to albums and the blue line is the percentage that went to singles. The launch of Napster marks the start of an immediate, general decline for albums, while the opposite holds for singles. We also see that, not so coincidentally, the launch of Spotify marks the point at which singles overtake albums.

Figure 3. Time series graph of the percentage of all awards that were albums (red line) and singles (blue line).
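The yearly shares behind a graph like this could be computed roughly as follows. This is a sketch under my own assumptions: the function name is hypothetical, each award is reduced to a (year, format) pair, only the two formats "album" and "single" are present, and the toy input is made up rather than taken from the RIAA data.

```python
from collections import Counter, defaultdict


def yearly_format_share(awards):
    """Return {year: {"album": pct, "single": pct}} from (year, fmt) pairs."""
    counts = defaultdict(Counter)
    for year, fmt in awards:
        counts[year][fmt] += 1

    shares = {}
    for year in sorted(counts):
        total = sum(counts[year].values())
        # A Counter returns 0 for a format with no awards in that year.
        shares[year] = {
            fmt: 100.0 * counts[year][fmt] / total
            for fmt in ("album", "single")
        }
    return shares


# Made-up toy input, not RIAA data.
toy = [(1998, "album"), (1998, "album"), (1998, "single"), (2016, "single")]
shares = yearly_format_share(toy)
```

Whether the year should come from the release date or the award date is a choice left to the caller here; the two would produce different curves.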


So what does this mean for an artist? Is there any insight to be had from this information? Well, we can see that singles now dominate the charts. The metric the RIAA uses for streams is debatable as a measure of success, but it still illustrates that singles make up the majority of commercially successful music. It follows that singles are the best way for an artist to reach fame. I think it is important to note that this applies to popular music. It is definitely not a suggestion for all genres, but rather for the charting genres: pop, hip-hop, alternative, and dance music. Lastly, many of the artists whose albums were incredibly successful are also among those who dominate the singles charts. What I think this really means is that these artists had already garnered enough attention to make an album a commercial success. Thus, the very concept of the "album" as the most popular medium for delivering a piece of musical work is arguably outdated, and singles are going to continue to rule popular music for years to come.
