Data Analysis on the Progression of Video Games
The skills demonstrated here can be learned through the Data Science with Machine Learning bootcamp at NYC Data Science Academy.
I was an avid gamer from the days of the Super Nintendo Entertainment System all the way to the PlayStation 3. However, I took a break and haven't played video games for the past five years, and I can barely recognize today's games. A lot has happened in that time, so I wanted to explore the general trends in video game sales over the past 20 to 30 years.
The data set can be downloaded from here (Kaggle dataset). I wanted to do some basic exploratory data analysis to get some insight into which games sell the most and how. I've also created a Tableau visualization, which can be found here. Finally, my code is in my GitHub repository, which can be found here.
As you can see, most of the games in the data set are on the Nintendo DS, with the PlayStation 2 a close second.
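A count like this is just a tally of titles per platform. Here is a minimal sketch using only the Python standard library. The rows, the titles, and the `platform` field name are made up for illustration and are not taken from the actual data set.

```python
from collections import Counter

# Toy rows standing in for the data set (hypothetical titles and field names).
rows = [
    {"name": "Game A", "platform": "DS"},
    {"name": "Game B", "platform": "DS"},
    {"name": "Game C", "platform": "PS2"},
    {"name": "Game D", "platform": "DS"},
    {"name": "Game E", "platform": "PS2"},
    {"name": "Game F", "platform": "Wii"},
]


def titles_per_platform(rows):
    """Count how many titles appear on each platform."""
    return Counter(row["platform"] for row in rows)


counts = titles_per_platform(rows)

# most_common() lists platforms from the most titles to the fewest.
for platform, n in counts.most_common():
    print(f"{platform}: {n}")
```

With the real data, the same tally would be fed by rows read from the downloaded CSV file.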
More importantly, I also wanted to get a sense of global sales, both by platform and by year. (Sales are in millions.)
From this chart, we can see that within the data set, most global sales come from the PlayStation 2, with a good mix of PlayStation 3 and Xbox 360 sales as well. These platforms belong to an older generation, so I wanted to see how dated my data set was. The next step was to look at global_sales per year.
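Both views, sales by platform and sales by year, come down to the same operation: group the rows by one field and sum global_sales within each group. A minimal sketch, assuming each row has `platform`, `year`, and `global_sales` fields (the figures below are invented, in millions):

```python
from collections import defaultdict

# Toy rows (made-up figures in millions; field names are assumptions).
rows = [
    {"platform": "PS2", "year": 2004, "global_sales": 2.0},
    {"platform": "PS3", "year": 2008, "global_sales": 1.5},
    {"platform": "X360", "year": 2008, "global_sales": 1.0},
    {"platform": "PS2", "year": 2005, "global_sales": 3.0},
]


def total_sales_by(rows, key):
    """Sum global_sales for each distinct value of the given field."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row["global_sales"]
    return dict(totals)


by_platform = total_sales_by(rows, "platform")
by_year = total_sales_by(rows, "year")

print(by_platform)
print(by_year)
```

The same helper serves both charts; only the grouping field changes.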
In our data set, most global sales fall between 2007 and 2010. Possible explanations include the then-recent releases of the PS3 and Xbox 360, as well as the popular PS2 being in its prime.
I created a histogram to see how well games were selling in general.
It seems like there's an outlier (which we'll get to later), so I zoomed in for a closer look.
The histogram is right-skewed, which makes sense: this data frame includes many games that were released but didn’t sell well. For every best-seller like Final Fantasy 7 out there, there are many more like the poorly received “Men in Black II: Alien Escape”.
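One quick way to confirm right skew without a plot is to compare the mean with the median: a long tail of best-sellers pulls the mean well above the median. The sketch below does that and also bins the values the way a histogram would. The sales figures and bin edges are invented for illustration.

```python
import statistics

# Made-up per-game global sales in millions: many small sellers, a few hits.
sales = [0.1, 0.1, 0.2, 0.2, 0.3, 0.5, 0.8, 1.5, 4.0, 30.0]

mean_sales = statistics.mean(sales)
median_sales = statistics.median(sales)


def bin_counts(values, edges):
    """Count values falling in each half-open bin [edges[i], edges[i + 1])."""
    counts = [0] * (len(edges) - 1)
    for value in values:
        for i in range(len(edges) - 1):
            if edges[i] <= value < edges[i + 1]:
                counts[i] += 1
                break
    return counts


edges = [0, 1, 5, 50]  # arbitrary bin edges chosen for this toy data
counts = bin_counts(sales, edges)

print(f"mean={mean_sales:.2f}, median={median_sales:.2f}")
print(counts)
```

Most of the toy games land in the lowest bin while a single hit stretches the tail, which is the shape described above.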
I needed to check the global sales per game to see how these years differ from the others.
This scatterplot shows a huge outlier around 2007, with more than twice the sales of the next best-selling games. I printed the head of the data set, showing the important columns, to see just what this game was and how far behind the next games were.
The numbers are in millions, and it's clear that Wii Sports is our outlier, with more than twice the sales of the next best-selling game, Super Mario Bros.
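Finding the outlier amounts to sorting games by global sales and comparing the leader with the runner-up. A minimal sketch with placeholder titles and invented figures (not the real numbers from the data set):

```python
import heapq

# (title, global sales in millions); hypothetical titles and made-up values.
games = [
    ("Game D", 1.2),
    ("Game B", 40.0),
    ("Game E", 0.4),
    ("Game A", 90.0),
    ("Game C", 35.0),
]


def top_sellers(games, n):
    """Return the n games with the highest global sales, best first."""
    return heapq.nlargest(n, games, key=lambda game: game[1])


top = top_sellers(games, 3)
(leader, leader_sales), (runner_up, runner_up_sales) = top[0], top[1]

# How many times larger the leader is than the next best seller.
ratio = leader_sales / runner_up_sales

print(top)
print(f"{leader} sold {ratio:.2f}x as much as {runner_up}")
```

A ratio above 2 between first and second place is what marks the leader as an outlier here.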
I then noticed that all five of the best-selling games were from Nintendo. The next step was to see how strong each publisher was in each region, to get an idea of the different regions' tastes.
It's interesting to note how similar tastes are in North America and Europe: mainly Nintendo, with the same order of studios following closely behind. Japan and the rest of the world, on the other hand, told a different story.
While Nintendo was still the strongest, Japan and the rest of the world showed more love for Sony Computer Entertainment.
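To compare regions this way, one option is to rank publishers by total sales within each region and then check whether two regions produce the same ordering. The sketch below assumes per-region sales fields named `na_sales`, `eu_sales`, `jp_sales`, and `other_sales`. Those names, the publishers, and the figures are all placeholders.

```python
from collections import defaultdict

# Assumed names for the per-region sales fields.
REGIONS = ["na_sales", "eu_sales", "jp_sales", "other_sales"]

# Toy rows with hypothetical publishers and made-up sales in millions.
rows = [
    {"publisher": "Publisher A", "na_sales": 5.0, "eu_sales": 3.0, "jp_sales": 2.0, "other_sales": 1.0},
    {"publisher": "Publisher B", "na_sales": 3.0, "eu_sales": 2.0, "jp_sales": 0.5, "other_sales": 0.3},
    {"publisher": "Publisher C", "na_sales": 1.0, "eu_sales": 1.0, "jp_sales": 1.5, "other_sales": 0.6},
    {"publisher": "Publisher A", "na_sales": 1.0, "eu_sales": 0.5, "jp_sales": 0.5, "other_sales": 0.25},
]


def publisher_ranking(rows, region):
    """Return publishers ordered by total sales in one region, best first."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["publisher"]] += row[region]
    return sorted(totals, key=totals.get, reverse=True)


rankings = {region: publisher_ranking(rows, region) for region in REGIONS}

# True when North America and Europe rank the publishers identically.
same_order = rankings["na_sales"] == rankings["eu_sales"]

for region, ranking in rankings.items():
    print(region, ranking)
```

In this toy data the leader is the same everywhere, but the order behind it changes outside North America and Europe.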
To dive even deeper into this, I wanted to explore how genre preferences differ by region to get an idea of what types of games sell best and where.
If the graphs look small in this blog, you can head over to the Tableau page linked above for a closer look.
Across all regions, the best-selling genres are Action, Sports, and Shooter, and the largest markets are North America and Europe. It's interesting to note that Japan is the prime market for role-playing games, its highest-selling genre.
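A genre-by-region breakdown can be built as a small pivot table: one row per genre, one column per region, each cell holding summed sales. From that table it is easy to read off each region's top genre and each region's overall market size. This is a sketch on invented numbers, with assumed field names.

```python
# Assumed names for the per-region sales fields.
REGIONS = ["na_sales", "eu_sales", "jp_sales"]

# Toy rows (made-up sales in millions).
rows = [
    {"genre": "Action", "na_sales": 4.0, "eu_sales": 3.0, "jp_sales": 1.0},
    {"genre": "Role-Playing", "na_sales": 1.0, "eu_sales": 1.0, "jp_sales": 2.0},
    {"genre": "Shooter", "na_sales": 3.0, "eu_sales": 2.0, "jp_sales": 0.5},
    {"genre": "Action", "na_sales": 2.0, "eu_sales": 1.0, "jp_sales": 0.5},
]


def genre_by_region(rows, regions):
    """Build {genre: {region: total sales}} from the rows."""
    table = {}
    for row in rows:
        cell = table.setdefault(row["genre"], dict.fromkeys(regions, 0.0))
        for region in regions:
            cell[region] += row[region]
    return table


def top_genre(table, region):
    """Return the genre with the highest total sales in one region."""
    return max(table, key=lambda genre: table[genre][region])


table = genre_by_region(rows, REGIONS)
top_by_region = {region: top_genre(table, region) for region in REGIONS}

# Total market size per region, summed over all genres.
market_size = {
    region: sum(cells[region] for cells in table.values()) for region in REGIONS
}

print(top_by_region)
print(market_size)
```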
It's clear that games sell best in North America and Europe, mainly those of the action or shooter type. Games also sell well in Japan if they're role-playing games. There may be seasonality in when games sell best, but I'll need an updated data set to test that hypothesis. Finally, the best-selling games are from Nintendo, which, ironically, focuses on family games rather than action shooters.
It was an interesting deep dive, but for future studies I'd like an updated version of the data set to see whether games do indeed sell best during generational shifts in video game platforms.