Data Based Board Games Market Analysis

Posted on Nov 4, 2021


Data Science Background


Data Science is perfect for what I want to research here.

I was late to the game. For most of my life, I was not a fan of board games. Fortunately, I was introduced to the joyful world of board games when I married into the Meyerovich family, who often spend Friday nights playing one of their favorite board games together.

It appears that the Meyerovich family members are not the only ones who are fond of board games. In 2019, the global board games market was worth an estimated USD 13.1 billion, and it is expected to grow significantly in the next few years. In line with this, the number of board games released has risen sharply over the past 60 years (Figure 1). These trends present great opportunities for new businesses and products to enter the board games market.

Data Science Objective

In this article, I share data-driven recommendations for designing successful board games. My goal is to inform individuals and businesses who are interested in entering the vibrant market of board games. To this end, I conducted exploratory data analysis (EDA) of over 20K board games using R.

Figure 1:

The Data

The dataset contains the attributes and the ratings for 20,343 board games from BoardGameGeek (BGG), an online resource and community that aims to be the definitive source for board game and card game content. It was obtained from Kaggle and can be accessed HERE.

Exploratory Data Analysis

My analysis revolves around one research question: What constitutes a highly rated board game? The dependent variable, average rating, is a numeric variable on a scale of 0-10. Its density curve is slightly skewed to the left, with a mean of 6.4 (Figure 2).
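To make "skewed to the left" concrete, here is a minimal sketch of one common way to quantify it: the moment-based skewness coefficient, which is negative when the left tail is longer. The original analysis was done in R; this sketch uses Python's standard library instead, and the ratings below are made-up values for illustration, not figures from the BGG dataset.

```python
from statistics import mean, median

def sample_skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (negative means a longer left tail)."""
    n = len(values)
    mu = sum(values) / n
    m2 = sum((v - mu) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mu) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical ratings on the 0-10 scale, NOT taken from the BGG dataset.
toy_ratings = [3.1, 5.2, 5.9, 6.3, 6.5, 6.8, 7.0, 7.1, 7.3, 7.6]

skew = sample_skewness(toy_ratings)
print(f"mean={mean(toy_ratings):.2f} median={median(toy_ratings):.2f} skewness={skew:.2f}")
```

In a left-skewed sample like this one, a few low ratings pull the mean below the median, which is the same pattern the density curve shows visually.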

Figure 2:

Average Rating by Complexity Average

Complexity Average is a numeric variable on a scale of 0-5. A high complexity score suggests that gamers perceive a game as relatively complicated, while a low complexity score implies that users find the game relatively simple. The density curve of complexity average is skewed to the right, with a mean of 2 (Figure 3). According to Figure 4, as complexity average increases, the average rating tends to rise as well. To better understand the association between the two variables, I divided the data points into quartiles according to their complexity score and plotted the four groups against average rating (Figure 5).

This confirms that complex board games are favored by BGG users, with a gap of over one point in average rating between games with low and very high complexity levels.
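The quartile grouping described above can be sketched as follows. This is an illustration only: the original work was done in R, the group labels and the function name are my own assumptions, and the (complexity, rating) pairs are invented to exercise the code, so the printed numbers say nothing about the real dataset.

```python
from statistics import mean, quantiles

# Hypothetical (complexity, rating) pairs, NOT from the BGG dataset.
toy_games = [
    (1.0, 5.6), (1.2, 5.9), (1.5, 6.0), (1.8, 6.1),
    (2.0, 6.3), (2.3, 6.4), (2.7, 6.8), (3.1, 7.0),
    (3.4, 7.1), (3.8, 7.3), (4.2, 7.4), (4.6, 7.8),
]

# Assumed labels for the four complexity groups.
LABELS = ["low", "medium", "high", "very high"]

def mean_rating_by_quartile(games):
    """Split games at the complexity quartiles and average each group's rating."""
    q1, q2, q3 = quantiles([c for c, _ in games], n=4)
    groups = {label: [] for label in LABELS}
    for complexity, rating in games:
        # Count how many cut points the score exceeds: 0 -> low, 3 -> very high.
        index = sum(complexity > cut for cut in (q1, q2, q3))
        groups[LABELS[index]].append(rating)
    return {label: mean(r) for label, r in groups.items() if r}

by_quartile = mean_rating_by_quartile(toy_games)
for label, avg in by_quartile.items():
    print(f"{label:>9}: {avg:.2f}")
```

Comparing the "low" and "very high" entries of the result gives the kind of gap in average rating that the text reports for the real data.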

Figure 3:


Figure 4:

Figure 5:

Average Rating by Play Time

Users prefer long games. Figure 6 shows a positive relationship between play time and average rating: games with a play time of over 90 minutes are rated higher, on average, than games in every other play time group.

Figure 6:

Average Rating by Number of Players

The information about the number of players is divided into two variables: minimum number of players and maximum number of players. Board games that can be played with a minimum of one player have, on average, higher ratings than all other groups, perhaps due to the flexibility they offer (Figure 7). Surprisingly, board games designed for a maximum of one player (i.e., solo games) also receive higher ratings, on average, than multiplayer board games (Figure 8).

Figure 7:

Figure 8:

Average Rating by Domains

Figure 9 suggests that strategy games are associated with higher average rating compared to all other domains. Wargames and thematic games are next, with slightly lower average rating compared to strategy games. Investigating the popularity of each domain, by looking at the number of games in which it appears, reveals that while strategy games are the most highly rated, they are second to wargames in popularity (Figure 10).

Among all domains, children's games have the lowest average rating. To better interpret these results, it would be informative to have demographic data, including the age of users. For instance, it is plausible that BGG users skew older and therefore dislike children's games.

Figure 9:

Figure 10:

Average Rating by Mechanics Popularity

Exploring the link between board game mechanics and average rating directly produced ambiguous results. Hence, I tried a different approach and examined average rating against mechanics popularity. I use the number of games in which a mechanic appears as a proxy for its popularity. I then divided the observations into four quartiles by popularity rank. Figure 11 indicates that mechanics with a very high popularity level have, on average, lower ratings than less popular mechanics. This suggests that users are interested in more unique content. Mechanics with a very high popularity level include well-known mechanics such as dice rolling, simulation and set collection.
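The popularity proxy described above, the number of games in which a mechanic appears, can be sketched like this. It is a rough illustration rather than the original R code: the row layout, the function name and the toy ratings are assumptions, and the mechanic lists are invented rather than drawn from the BGG dataset.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows: (average rating, list of mechanics). NOT from the BGG dataset.
toy_games = [
    (6.1, ["Dice Rolling", "Set Collection"]),
    (6.4, ["Dice Rolling"]),
    (7.2, ["Worker Placement"]),
    (6.0, ["Dice Rolling", "Simulation"]),
    (7.5, ["Deck Building", "Worker Placement"]),
    (6.6, ["Set Collection"]),
]

def mechanic_summary(games):
    """Return {mechanic: (number of games it appears in, mean rating of those games)}."""
    ratings = defaultdict(list)
    for rating, mechanics in games:
        for mechanic in mechanics:
            ratings[mechanic].append(rating)
    return {m: (len(r), mean(r)) for m, r in ratings.items()}

summary = mechanic_summary(toy_games)
# Rank mechanics from most to least popular (popularity = game count).
ranked = sorted(summary.items(), key=lambda item: item[1][0], reverse=True)
for mechanic, (count, avg) in ranked:
    print(f"{mechanic:<17} games={count} mean_rating={avg:.2f}")
```

Once each mechanic has a game count, the ranked list can be cut into four popularity groups and the mean rating compared across them. Note that a game with several mechanics contributes its rating to each of them.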

Figure 11:

Data Science Conclusions

What constitutes highly rated board games?

  • High Complexity: Board games with a higher complexity score have, on average, higher ratings.
  • Long Play Time: Board games with a play time of over 90 minutes have, on average, higher ratings than other play time groups.
  • Popular Domains: The most highly rated domains are Strategy Games, Wargames and Thematic Games.
  • Minimum One Player: Board games that can be played with a minimum of one player have, on average, higher ratings.
  • Solo Games: Board games designed for a maximum of one player have, on average, higher ratings than multiplayer games.
  • Mechanics Popularity: Board games with highly popular mechanics have, on average, lower ratings.

Future Work

Sentiment Analysis:

To identify additional factors associated with high ratings, I recommend conducting sentiment analysis of BGG board game reviews. Sentiment analysis is the contextual mining of text to identify subjective information in source material. By analyzing board game reviews, we may extract meaningful information about how users feel about specific game features and gain a better understanding of why they like or dislike certain games.

About Author

Ayelet Hillel

Data Science Professional with experience in research alongside program management. I am passionate about developing data-driven solutions using statistical methodologies and programming languages including Python and R.