Building a Video Game Recommendation System

Brenna Botzheim

Posted on Aug 20, 2020

This post walks through how I built a video game recommendation system for my NYC Data Science Academy Capstone Project. The recommender app is located here.

Building a recommendation system involved many steps: collecting the data, preparing it, building a model, and deploying an interactive app. I collected the data from Metacritic, using Scrapy to gather user reviews and game data for every video game on the site. That data was then processed, adding unique identifiers for every game and user, and explored visually using Python. Next, I built a model from the reviews of each game using Doc2Vec, an unsupervised algorithm that generated a vector for each game based on its reviews. These vectors could then be compared to one another to find similar games. The final product of this project is an app that recommends video games based on various inputs from a user. The app is a single-page web application built with Flask and jQuery, in cooperation with Andrew Clarry, a friend of mine who is more knowledgeable in the field of web development.

Here is a relatively brief overview of the process of building this recommendation system.

Scraping the Data

This project primarily uses Python, and Scrapy was chosen for the web-scraping task. Luckily, Metacritic is not a JavaScript-heavy website, so the necessary information could easily be obtained by crawling through the HTML.

Data was collected on every game, including the game title, developer, publisher, release date, and platform (Xbox, PlayStation, etc.). Data was also scraped for every review of every video game, mainly the username, review text, and rating. The resulting data included about 17,000 games and over 1,000,000 reviews.

Data Processing

Once the data was collected, it needed to be processed for longer-term storage and prepared for the model. First, the data was combed through to understand what it looked like and to get a feel for how many missing values existed. Some missing values were easily fixable, but most were caused by the data not existing on Metacritic to begin with. The most important information, including video game identifiers and review texts, was almost entirely accounted for.
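As a rough illustration of this step, here is a minimal sketch of assigning unique identifiers to games and users and counting missing values. It uses only the Python standard library, and the field names and sample rows are assumptions made up for the example, not the project's actual schema.

```python
from collections import Counter


def assign_ids(values):
    """Map each distinct value to an integer ID, in first-seen order."""
    ids = {}
    for value in values:
        if value not in ids:
            ids[value] = len(ids)
    return ids


def count_missing(rows, fields):
    """Count rows where a field is absent, None, or blank."""
    missing = Counter({field: 0 for field in fields})
    for row in rows:
        for field in fields:
            value = row.get(field)
            if value is None or str(value).strip() == "":
                missing[field] += 1
    return dict(missing)


# Hypothetical scraped rows; the field names are assumptions.
reviews = [
    {"game": "Game A", "username": "user1", "text": "Great fun", "rating": 9},
    {"game": "Game B", "username": "user2", "text": "", "rating": 4},
    {"game": "Game A", "username": "user2", "text": "Too short", "rating": None},
]

game_ids = assign_ids(r["game"] for r in reviews)
user_ids = assign_ids(r["username"] for r in reviews)

# Attach the numeric IDs to every review row.
for r in reviews:
    r["game_id"] = game_ids[r["game"]]
    r["user_id"] = user_ids[r["username"]]

missing = count_missing(reviews, ["text", "rating"])
```

With data of this size, the same idea would more likely be done with a dataframe library, but the logic is the same.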

In preparation for building the model, the reviews were processed to remove unnecessary punctuation and unhelpful, frequently occurring words (stop words). Each review was then 'tagged' with the ID of the game it was written about, so that the model could compare by game.
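The cleaning and tagging described above can be sketched roughly as follows. This is an illustration under assumptions, not the project's actual code: the stop-word list is a tiny made-up sample, and TaggedDocument here is a stand-in namedtuple with the same two fields (words, tags) as the class of that name in gensim, so the example runs without gensim installed.

```python
import re
from collections import namedtuple

# Stand-in with the same fields as gensim's TaggedDocument (words, tags).
TaggedDocument = namedtuple("TaggedDocument", ["words", "tags"])

# A tiny illustrative stop-word list; a real one would be much longer.
STOP_WORDS = {"the", "a", "an", "and", "is", "it", "this", "of", "to", "i"}


def preprocess(text):
    """Lowercase the text, drop punctuation, and remove stop words."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return [token for token in tokens if token not in STOP_WORDS]


def tag_reviews(reviews):
    """Turn (game_id, review_text) pairs into tagged documents."""
    return [
        TaggedDocument(words=preprocess(text), tags=[game_id])
        for game_id, text in reviews
    ]


docs = tag_reviews([
    (0, "This is a great open-world RPG!"),
    (1, "The controls are clunky, and it crashes."),
])
```

Because every review of a game carries the same tag, all of that game's reviews contribute to a single vector.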

Building the Model

Doc2Vec is a fairly straightforward model to train. The important part is to tag each review with its game ID; the model then builds a vector for each tag, based on the review text carrying that tag. Below is the code for building the model. The 'build_vocab' call builds the vocabulary from the review corpus, which is what lets a user search for game recommendations by keyword, so long as the keyword appears in the text corpus. Then the model is trained on the reviews:

As you can see below, the trained model can be used for simple querying to return the most relevant games.

A search based on a game I already like:

[Image: game-result.jpg, query results for a game title]

A search based on a keyword that I'm interested in:

[Image: keyword-result.jpg, query results for a keyword]
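Queries like the two above come down to comparing vectors, and gensim's similarity queries rank by cosine similarity. To make that concrete, here is a small dependency-free sketch of the ranking; the three-dimensional vectors and game names are invented for the example and are far smaller than real Doc2Vec vectors.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query, vectors, topn=3):
    """Rank named vectors by cosine similarity to the query vector."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:topn]


# Hypothetical game vectors.
game_vectors = {
    "game_a": [1.0, 0.0, 0.2],
    "game_b": [0.9, 0.1, 0.3],
    "game_c": [0.0, 1.0, 0.0],
}

# Rank every other game against game_a.
others = {name: vec for name, vec in game_vectors.items() if name != "game_a"}
ranked = most_similar(game_vectors["game_a"], others, topn=2)
```

Here game_b points in nearly the same direction as game_a and ranks first, while game_c is orthogonal to it and ranks last.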

Deploying the Model in a User-Friendly App

Even this very simple querying already returns relevant results, but a better way to implement the recommender is through a user-friendly app that allows for some filtering.

I teamed up with a friend of mine, Andrew Clarry, to design an app for this purpose. The final result allows a user to refine results by the platform they would like to play on and the genres they are specifically interested in, and to enter a game title or keyword to get a result from the model. While platform and genre are optional inputs, the game title or keyword is required to get recommendations. The result is a pretty nifty, lightweight game recommendation app that scarily predicts games that I have on my Steam wish list! A more advanced implementation in the future could account for past user interests and filter out redundant recommendations (such as games previously played). You can check out the results of this project here.
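One way the optional filters could be applied is as a pass over the model's ranked output. The sketch below is an assumption about how that might look, not the app's actual code: the catalog structure, its field names, and the exclude parameter (for games already played) are all hypothetical.

```python
def filter_recommendations(ranked_ids, catalog, platform=None, genres=None,
                           exclude=None, topn=10):
    """Keep ranked games that match the optional platform and genre filters."""
    wanted_genres = set(genres or [])
    excluded = set(exclude or [])
    results = []
    for game_id in ranked_ids:
        game = catalog.get(game_id)
        if game is None or game_id in excluded:
            continue
        if platform and platform not in game["platforms"]:
            continue
        # A game passes if it shares at least one requested genre.
        if wanted_genres and not wanted_genres & set(game["genres"]):
            continue
        results.append(game["title"])
        if len(results) == topn:
            break
    return results


# Hypothetical catalog keyed by game ID.
catalog = {
    0: {"title": "Game A", "platforms": ["PC", "Xbox One"], "genres": ["RPG"]},
    1: {"title": "Game B", "platforms": ["PlayStation 4"], "genres": ["Racing"]},
    2: {"title": "Game C", "platforms": ["PC"], "genres": ["RPG", "Action"]},
}

# Game IDs in the order the model ranked them, most similar first.
ranked_ids = [1, 2, 0]

pc_rpgs = filter_recommendations(ranked_ids, catalog, platform="PC", genres=["RPG"])
unplayed = filter_recommendations(ranked_ids, catalog, platform="PC", exclude=[2])
```

Filtering after ranking keeps the model itself unchanged, at the cost of asking it for more candidates than will finally be shown.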

About Author

Brenna Botzheim

Brenna Botzheim is an associate EOV Analyst at StormGeo. Brenna holds a bachelor's degree from San Francisco State University, where she studied sociology and mathematics. In her spare time, Brenna continues to develop her skills in statistical data...

