Data Analysis on Classpass Fitness Studio

Posted on Jul 22, 2021
The skills I demoed here can be learned by taking the Data Science with Machine Learning bootcamp at NYC Data Science Academy.


Over the past decade, data shows that the fitness industry has become highly fragmented and localized with the rapid rise of trendy boutique studios, particularly in cities like NYC. Classpass started in 2013 as an effort to make boutique fitness easier to navigate for the average consumer looking to try new experiences. The company gives its subscribers easy access to a wide range of fitness classes, gyms, and other wellness offerings. Altogether, the businesses on Classpass represent a good snapshot of the local fitness industry.

As with all other areas, the pandemic brought about immense change for the gym industry. As a result of widespread lockdowns, there was sudden and rapid growth of virtual workouts. This trend is illustrated in the plot below, sourced from Mindbody.


For this project, I used Classpass data to gain a better understanding of local fitness trends, with a focus on NYC in particular. These insights would be helpful in my current position at a corporate fitness center as a form of competitive analysis. I used Selenium to scrape studio data from New York, Miami, Atlanta, and San Francisco. These are the four healthiest cities according to the 2021 Mindbody Wellness Index. Since NYC was ranked fourth, I was interested in comparing it with the three higher-ranked cities to identify any major differences that might point to areas of opportunity.

Additionally, I set out to learn whether and how traditional fitness studios had adapted to the new livestream trend. Did studios with livestream offerings rate higher than those without, and were they more popular? I also wanted to find out what types of classes were the most popular, and whether there were any differences for livestreams.

Top 4 City Comparison

After scraping studio information from all four cities, my first step was to perform exploratory data analysis to get an idea of the combined dataset I was working with. The plot below shows the number of Classpass studios by city.

Since NYC is Classpass's founding city and also the most populous by a wide margin, it makes sense that it has the most studios. San Francisco, the second most populous of the four, has the next highest number of studios. In comparison, Atlanta and Miami have surprisingly few studios.
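The per-city counts behind a plot like this are a simple tally. Here is a minimal sketch, assuming each scraped studio is stored as a dictionary with a `city` field (the field name and the sample records are made up for illustration and are not the real dataset):

```python
from collections import Counter

# Illustrative records only; the real scraped data is not reproduced here.
# The "city" field name is an assumption about how the scrape was stored.
studios = [
    {"name": "Studio A", "city": "New York"},
    {"name": "Studio B", "city": "New York"},
    {"name": "Studio C", "city": "San Francisco"},
    {"name": "Studio D", "city": "Miami"},
]


def studios_per_city(records):
    """Count studios per city, largest count first."""
    return Counter(r["city"] for r in records).most_common()


for city, n in studios_per_city(studios):
    print(f"{city}: {n}")
```

The resulting list of (city, count) pairs can be passed straight to whatever plotting library is in use.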

Average Ratings Data

Despite the differences in studio counts, studio ratings across all four cities are uniformly high. These ratings can be valuable for corporate fitness clients pursuing local studio partnerships. Since the bulk of ratings average well above 4.5/5.0, it's unfortunately difficult to use them to draw any insightful conclusions. The only major takeaway is that a rating below roughly 4.5 is a strong indicator of relatively low quality.

Next, I took a look at what proportion of studios in each city had added virtual classes to their offerings. This data was plotted one year into the pandemic.

My initial impression from the plot is that "Zoom fitness" is still a relatively new concept with room to grow. Interestingly, Miami and NYC have the highest adoption rates, with close to 16% of their studios offering livestreams. Miami also has the lowest studio count yet is ranked the #1 healthiest city by Mindbody (referenced in the intro), which could suggest that its businesses tend to be highly innovative.

Another factor that could have heavily influenced livestream adoption rates is business necessity, driven by the city-by-city timing of pandemic lockdowns. It is difficult to trace the timing of these restrictions to verify this relationship, but it is important to keep in mind.
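The adoption rate discussed above is just the share of studios in each city that offer at least one livestream class. A minimal sketch, assuming a hypothetical boolean `has_livestream` field on each record (the records below are toy values, not the scraped data):

```python
from collections import defaultdict

# Toy records, not real data. "has_livestream" is a hypothetical field name.
studios = [
    {"city": "New York", "has_livestream": True},
    {"city": "New York", "has_livestream": False},
    {"city": "New York", "has_livestream": False},
    {"city": "New York", "has_livestream": False},
    {"city": "Miami", "has_livestream": True},
    {"city": "Miami", "has_livestream": False},
]


def livestream_adoption(records):
    """Return {city: share of that city's studios offering livestreams}."""
    totals = defaultdict(int)
    with_stream = defaultdict(int)
    for r in records:
        totals[r["city"]] += 1
        with_stream[r["city"]] += int(r["has_livestream"])
    return {city: with_stream[city] / totals[city] for city in totals}


for city, rate in livestream_adoption(studios).items():
    print(f"{city}: {rate:.0%}")
```

Dividing by each city's own total matters here, because the cities differ so much in studio count that raw livestream counts would mostly reflect city size.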

Livestream Insights Data

Next, I set out to determine whether studios offering livestream classes could be considered more popular or higher performing by any available metric. The following plot shows the number of ratings for studios with and without livestreams.

A large proportion of studios with livestreaming are in the bucket with the most ratings. This could be an indication of popularity, implying that studios offering livestreams tend to be rated and reviewed more frequently than those without.

Note that these buckets make comparisons challenging, especially given the ambiguity of the 5000+ range (are there any outstanding studios with far more than 5,000 reviews? We can't tell from the available data). I will discuss possibilities for further analysis later in the post.
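Because far fewer studios offer livestreams than not, the fair comparison is the share of each group that falls in a given bucket, not the raw counts. A minimal sketch of that normalization follows. Only the "5000+" label comes from this post; the other bucket label, the field names, and the records are hypothetical.

```python
from collections import Counter, defaultdict

# Toy records, not real data. Only the "5000+" label comes from the post;
# the "1000+" label and both field names are hypothetical.
studios = [
    {"has_livestream": True, "rating_bucket": "5000+"},
    {"has_livestream": True, "rating_bucket": "5000+"},
    {"has_livestream": True, "rating_bucket": "1000+"},
    {"has_livestream": False, "rating_bucket": "5000+"},
    {"has_livestream": False, "rating_bucket": "1000+"},
    {"has_livestream": False, "rating_bucket": "1000+"},
    {"has_livestream": False, "rating_bucket": "1000+"},
]


def bucket_shares(records):
    """Return {has_livestream: {bucket: share of that group's studios}}."""
    counts = defaultdict(Counter)
    for r in records:
        counts[r["has_livestream"]][r["rating_bucket"]] += 1
    shares = {}
    for group, buckets in counts.items():
        total = sum(buckets.values())
        shares[group] = {b: n / total for b, n in buckets.items()}
    return shares


shares = bucket_shares(studios)
print(shares[True]["5000+"], shares[False]["5000+"])
```

Note that the open-ended top bucket stays a label here. Nothing in this sketch can recover how far above 5,000 a studio's review count actually is.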

The next plot compares average studio ratings for studios with and without livestream services. The plot shows no significant difference in average ratings, despite the difference in popularity suggested above.
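The comparison above was made visually from the plot. One way to put a number on a statement like "no significant difference" is a permutation test on the difference in mean ratings. This is a technique swapped in for illustration, not the method used in this project, and the ratings below are toy values.

```python
import random
from statistics import mean

# Toy ratings, not real data.
with_stream = [4.8, 4.7, 4.9, 4.8]
without_stream = [4.7, 4.8, 4.9, 4.6, 4.8]


def permutation_p_value(a, b, n_iter=5000, seed=0):
    """Two-sided permutation test on the difference in group means.

    Repeatedly shuffles the pooled values, splits them into groups of the
    original sizes, and reports how often the shuffled difference is at
    least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_iter):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        # Small tolerance guards against floating-point noise on ties.
        if diff >= observed - 1e-12:
            hits += 1
    return hits / n_iter


p = permutation_p_value(with_stream, without_stream)
print(f"p-value: {p:.3f}")
```

A large p-value means the observed gap is the kind that shuffling the labels produces routinely, which is consistent with the two groups having similar ratings.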

Analyzing Studio Tags Data

To round out my analysis, I compiled a list of the tags associated with each studio, which identify the types of workouts offered. To start, I plotted the top 20 tags in NYC. I then highlighted tags related to livestreams.

Interestingly, strength training is by far the most popular offering on Classpass. However, it's only the fourth most popular livestream. This is a major area of opportunity for studios, as there is industry-wide room for growth. Livestream dance is also notably absent from this list and would be another easy addition to virtual offerings.

I also plotted Miami, Atlanta, and San Francisco tags and got similar results. One notable addition on this list is livestream barre, which is surprisingly uncommon in NYC.
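The tag rankings above come down to counting how many studios carry each tag and flagging the livestream ones. A minimal sketch, assuming each studio's tags are stored as a list of strings and that livestream tags start with the word "livestream" (both are assumptions about the scraped format, and the tag lists are made up):

```python
from collections import Counter

# Toy tag lists, not real data. Identifying livestream tags by a
# "livestream" prefix is an assumption about the tag format.
studio_tags = [
    ["strength training", "yoga", "livestream yoga"],
    ["strength training", "cycling"],
    ["strength training", "livestream hiit", "yoga"],
]


def top_tags(tag_lists, n=20):
    """Return the n most common tags as (tag, studio_count, is_livestream).

    Each studio counts at most once per tag. Ties are broken alphabetically
    so the ordering is stable between runs.
    """
    counts = Counter(tag for tags in tag_lists for tag in set(tags))
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]
    return [(tag, count, tag.startswith("livestream")) for tag, count in ranked]


for tag, count, is_livestream in top_tags(studio_tags):
    marker = " (livestream)" if is_livestream else ""
    print(f"{tag}: {count}{marker}")
```

The boolean flag can drive the bar colors in a plot, which is one way to highlight livestream tags among the top 20.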

Key Takeaways

Altogether, here are the major insights and recommendations for fitness establishments:

  • City rankings aside, NYC has a strong and remarkably crowded network of fitness studios. These businesses have a wide range of offerings, and some have been quick to adapt to the times with livestream services.
  • Livestream adoption rates for studios are still low, at 7-15 percent. Even with changing lockdown restrictions, it would make sense for studios to invest heavily in improving livestream offerings. Livestreams can supplement in-person classes and go a long way towards helping a business stay relevant in a competitive market.
  • Workout trends across the four healthiest cities are very similar. Currently, the largest opportunities for growth are in livestream strength training and livestream dance.

Further Analysis

First, because of bucketed ratings and uniformly high reviews, it’s challenging to determine which Classpass studios are the highest performing. We can gain better insight by referencing reviews on Yelp or Google Reviews and performing sentiment analysis on user reviews.

Another interesting source for competitive insights would be livestream-only workout services, such as apps like Peloton. These services have grown substantially throughout the pandemic and it would be valuable to learn more about their offerings.

Finally, we can get a more granular perspective on the data by analyzing details of individual class offerings. For example, determining which class times, class durations, and pricing models perform best would be extremely useful for day-to-day business operations. There would most likely be major differences between livestream and in-person class trends.

As the fitness industry continues to evolve, technology integration is sure to increase, and a growing amount of data will be available as a result. Evidently, there are countless opportunities to learn from this data in the future. The insights uncovered will help business operations immensely and support the growth and success of the industry.

Github Repository | LinkedIn
