Data Analysis on Classpass Fitness Studio
The skills I demonstrated here can be learned through the Data Science with Machine Learning bootcamp at NYC Data Science Academy.
Background
Over the past decade, the fitness industry has become highly fragmented and localized with the rapid rise of trendy boutique studios, particularly in cities like NYC. Classpass.com started in 2013 as an effort to make boutique fitness easier to navigate for the average consumer looking to try new experiences. The company gives its subscribers easy access to a wide range of fitness classes, gyms, and other wellness offerings. Altogether, the businesses on Classpass represent a good snapshot of the local fitness industry.
As with all other areas, the pandemic brought about immense change for the gym industry. As a result of widespread lockdowns, there was sudden and rapid growth of virtual workouts. This trend is illustrated in the plot below, sourced from Mindbody.
Objectives
For this project, I used Classpass data to gain a better understanding of local fitness trends, with a focus on NYC in particular. These insights would be helpful in my current position at a corporate fitness center as a form of competitive analysis. I used Selenium to scrape studio data from New York, Miami, Atlanta, and San Francisco. These are the top four healthiest cities according to the 2021 Mindbody Wellness Index. Since NYC was ranked fourth, I was interested in comparing it with the three higher-ranked cities to identify any major differences that might point to areas of opportunity.
Additionally, I set out to learn whether and how traditional fitness studios had adapted to the new livestream trend. Did studios with livestream offerings rate higher than those without, and were they more popular? I also wanted to find out which types of classes were the most popular, and whether there were any differences for livestreams.
Top 4 City Comparison
After scraping studio information from all four cities, my first step was to perform exploratory data analysis to get an idea of the combined dataset I was working with. The plot below shows the number of Classpass studios by city.
Since NYC is Classpass's founding city and also the most populous by a wide margin, it makes sense that it has the most studios. San Francisco, the second most populous of the four, has the next highest number of studios. In comparison, Atlanta and Miami have surprisingly few studios.
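As a rough illustration, the per-city studio counts behind a plot like this can be tallied with a few lines of Python. This is a minimal sketch using only the standard library, not the code used for the post; the record layout and the field names `name` and `city` are assumptions, and the sample rows are made up.

```python
from collections import Counter

# Hypothetical scraped records: one dict per studio.
# The field names ("name", "city") are assumptions, not the real schema.
sample_studios = [
    {"name": "Studio A", "city": "New York"},
    {"name": "Studio B", "city": "New York"},
    {"name": "Studio C", "city": "San Francisco"},
    {"name": "Studio D", "city": "Miami"},
]


def studios_per_city(records):
    """Return (city, count) pairs sorted from most to fewest studios."""
    counts = Counter(record["city"] for record in records)
    return counts.most_common()


for city, count in studios_per_city(sample_studios):
    print(f"{city}: {count}")
```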
Average Ratings Data
Despite the differences in studio counts, studio ratings across all four cities are uniformly high. These ratings can be valuable for corporate fitness clients pursuing local studio partnerships. Unfortunately, since the bulk of ratings average well above 4.5/5.0, it's difficult to draw insightful conclusions from them. The only major takeaway is that a rating below ~4.5 is a strong indicator of relatively low quality.
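One way to make "uniformly high" concrete is to summarize the ratings with a median and the share of studios falling below the ~4.5 mark. The sketch below assumes the ratings are available as a plain list of floats; the sample values are invented for illustration.

```python
from statistics import median


def rating_summary(ratings, threshold=4.5):
    """Summarize ratings: the median and the share strictly below the threshold."""
    below = [r for r in ratings if r < threshold]
    return {
        "median": median(ratings),
        "share_below": len(below) / len(ratings),
    }


# Made-up ratings, for illustration only.
sample_ratings = [4.9, 4.8, 4.7, 4.6, 4.2]
print(rating_summary(sample_ratings))
```

When nearly every studio sits above the threshold, the share below it is a more informative number than the mean.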
Next, I took a look at what proportion of studios in each city had added virtual classes to their offerings. This data was plotted one year into the pandemic.
My initial impression from the plot is that "Zoom fitness" is still a relatively new concept with room to grow. Interestingly, Miami and NYC have the highest adoption rates, with close to 16% of their studios offering livestreams. Miami also has the lowest studio count yet is ranked the #1 healthiest city by Mindbody (referenced in the intro), which could suggest that its businesses tend to be highly innovative.
Another factor which could have heavily influenced livestream adoption rates is business necessity determined by the city-by-city timing of pandemic lockdowns. It is difficult to trace back the timing of these restrictions to verify this relationship, but important to keep in mind.
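The adoption rate discussed here is simply the number of studios with livestreams divided by the total number of studios in each city. A minimal sketch, assuming each scraped record carries a boolean `has_livestream` field (a hypothetical name) and using made-up rows:

```python
from collections import defaultdict


def livestream_adoption(records):
    """Map each city to the fraction of its studios that offer livestreams."""
    totals = defaultdict(int)
    with_stream = defaultdict(int)
    for record in records:
        totals[record["city"]] += 1
        if record["has_livestream"]:
            with_stream[record["city"]] += 1
    # Cities with no livestream studios still appear, with a rate of 0.0.
    return {city: with_stream[city] / totals[city] for city in totals}


# Made-up records; "has_livestream" is an assumed field name.
sample_records = [
    {"city": "New York", "has_livestream": True},
    {"city": "New York", "has_livestream": False},
    {"city": "New York", "has_livestream": False},
    {"city": "New York", "has_livestream": False},
    {"city": "Atlanta", "has_livestream": False},
    {"city": "Atlanta", "has_livestream": False},
]

for city, rate in livestream_adoption(sample_records).items():
    print(f"{city}: {rate:.0%}")
```

Reporting a rate rather than a raw count matters because the four cities differ so much in studio count.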
Livestream Insights Data
Next, I set out to determine whether studios offering livestream classes could be considered more popular or higher performing by any available metric. The following plot shows the number of ratings for studios with and without livestreams.
A large proportion of studios with livestreaming are in the bucket with the most ratings. This could be an indication of popularity, implying that studios offering livestreams tend to be rated and reviewed more frequently than those without.
Note that these buckets make comparisons challenging, especially given the ambiguity of the 5000+ range. Are there any outstanding studios with far more than 5,000 reviews? We can't tell from the available data. I will discuss possibilities for further analysis later in the post.
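The bucket comparison described above can be sketched as a small cross-tabulation: for each group (with and without livestreams), compute the share of studios in each ratings-count bucket. Apart from "5000+", the bucket labels below are placeholders rather than the actual Classpass ranges, and the field names are assumptions.

```python
from collections import Counter

# Only "5000+" comes from the post; the other labels are hypothetical.
BUCKET_ORDER = ["0-500", "500-1000", "1000-5000", "5000+"]


def bucket_shares(records):
    """For each livestream group, return the share of studios per bucket."""
    counts = {True: Counter(), False: Counter()}
    for record in records:
        counts[record["has_livestream"]][record["ratings_bucket"]] += 1
    shares = {}
    for group, counter in counts.items():
        total = sum(counter.values())
        shares[group] = {
            bucket: (counter[bucket] / total if total else 0.0)
            for bucket in BUCKET_ORDER
        }
    return shares


# Made-up records for illustration.
sample_buckets = [
    {"has_livestream": True, "ratings_bucket": "5000+"},
    {"has_livestream": True, "ratings_bucket": "5000+"},
    {"has_livestream": True, "ratings_bucket": "1000-5000"},
    {"has_livestream": False, "ratings_bucket": "0-500"},
    {"has_livestream": False, "ratings_bucket": "0-500"},
    {"has_livestream": False, "ratings_bucket": "1000-5000"},
    {"has_livestream": False, "ratings_bucket": "5000+"},
]

for group, dist in bucket_shares(sample_buckets).items():
    print("livestream" if group else "no livestream", dist)
```

Normalizing within each group is the important design choice: studios with livestreams are a small minority, so raw counts would hide the difference in distribution.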
The next plot compares average studio ratings in studios with and without livestream services. The plot shows no significant difference in average ratings of studios, despite the difference in popularity established previously.
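A visual comparison can be backed up with a simple significance check. The sketch below uses a two-sided permutation test on the difference in mean rating; this is one reasonable technique for the question, not necessarily what was done for the post, and the sample ratings are invented.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_resamples=2000, seed=0):
    """Two-sided permutation test for a difference in mean rating.

    Returns (observed_difference, estimated_p_value).
    """
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_resamples):
        # Shuffle the pooled ratings and re-split them into two groups.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        # Small tolerance guards against floating-point ties.
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1
    # Add-one correction keeps the estimated p-value above zero.
    return observed, (extreme + 1) / (n_resamples + 1)


# Made-up ratings for studios with and without livestreams.
with_livestream = [4.8, 4.7, 4.9, 4.8, 4.6]
without_livestream = [4.7, 4.8, 4.9, 4.6, 4.8, 4.7]

diff, p_value = permutation_test(with_livestream, without_livestream)
print(f"difference in means: {diff:.3f}, p-value: {p_value:.3f}")
```

A large p-value here would be consistent with the plot's message that the two groups are rated about the same.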
Analyzing Studio Tags Data
To round out my analysis, I compiled a list of tags associated with each studio which identifies the type of workout offered. To start, I plotted the top 20 tags in NYC. I then highlighted tags related to livestreams.
Interestingly, strength training is by far the most popular offering on Classpass. However, it's only the fourth most popular livestream. This is a major area of opportunity for studios, as there is industry-wide room for growth. Livestream dance is also notably absent from this list and would be another easy addition to virtual offerings.
I also plotted tags for Miami, Atlanta, and San Francisco and got similar results. One notable addition to these lists is livestream barre, which is surprisingly uncommon in NYC.
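The tag ranking described in this section can be sketched with a counter over each studio's tag list. This sketch assumes livestream tags can be recognized by a "livestream" prefix, which is a guess about how the tags are worded; the sample tags are made up.

```python
from collections import Counter


def top_tags(studio_tags, n=20):
    """Count how many studios carry each tag and return the n most common."""
    counts = Counter()
    for tags in studio_tags:
        # A set ensures a tag repeated on one studio is counted once.
        counts.update(set(tags))
    return counts.most_common(n)


def is_livestream(tag):
    """Assumption: livestream tags start with the word 'livestream'."""
    return tag.lower().startswith("livestream")


# Made-up tag lists, one per studio.
sample_tags = [
    ["strength training", "yoga"],
    ["strength training", "livestream yoga"],
    ["yoga", "strength training", "strength training"],
]

for tag, count in top_tags(sample_tags):
    marker = " (livestream)" if is_livestream(tag) else ""
    print(f"{tag}: {count}{marker}")
```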
Key Takeaways
Altogether, here are the major insights and recommendations for fitness establishments:
- City rankings aside, NYC has a strong and remarkably crowded network of fitness studios. These businesses have a wide range of offerings, and some have been quick to adapt to the times with livestream services.
- Livestream adoption rates for studios are still low, at 7-15 percent. Even with changing lockdown restrictions, it would make sense for studios to invest heavily in improving livestream offerings. Livestreams can supplement in-person classes and go a long way towards helping a business stay relevant in a competitive market.
- Workout trends across the four healthiest cities are very similar. Currently, the largest opportunities for growth are in livestream strength training and livestream dance.
Further Analysis
First, because of bucketed ratings and uniformly high reviews, it’s challenging to determine which Classpass studios are the highest performing. We can gain better insight by referencing reviews on Yelp or Google Reviews and performing sentiment analysis on user reviews.
Another interesting source for competitive insights would be livestream-only workout services, such as apps like Peloton. These services have grown substantially throughout the pandemic and it would be valuable to learn more about their offerings.
Finally, we can get a more granular perspective of the data by analyzing details on individual class offerings. For example, determining which class times, class durations, and pricing models perform best would be extremely useful for the day-to-day business operations. There would most likely be major differences between livestream and in-person class trends.
As the fitness industry continues to evolve, technology integration is sure to increase, and a growing amount of data will be available as a result. Evidently, there are countless opportunities to learn from this data in the future. The insights uncovered will help business operations immensely and support the growth and success of the industry.