Where Can CVS Expand New Stores?

Posted on Aug 3, 2020

For this project, CVS location and US population data were analyzed to suggest where CVS should consider opening new stores.


CVS Pharmacy is the largest pharmacy chain in the United States, with almost 10,000 locations as of August 2020. Last year the company reported $256B in revenue and was listed as #5 on the Fortune 500 list.

According to CVS's SEC filing, their major sources of income are pharmacy and in-store sales. In addition, the majority of the pharmacy sales are within network, as opposed to online or mail-order. Therefore, the vast majority of revenue for CVS comes from people physically walking into the store.

If CVS wants to continue growing, the most direct route appears to be opening new stores. However, finding new sites is difficult because CVS already operates in so many locations. CVS should therefore target areas with higher demand when opening stores.

Data Collection

In this project, the location and count of every CVS store in the United States were scraped from the CVS website [1]. Store location data was collected and processed with the Python BeautifulSoup package.

Population data was downloaded from the US Census website. Population estimates for 2019 were used for all states (including the District of Columbia and Puerto Rico) and for all cities with more than 50,000 residents.

These two datasets were combined using Python Pandas to generate a data frame of CVS store counts alongside state and city populations.
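The combining step might look roughly like the sketch below. It is only an illustration: the column names (`state`, `city`, `population`, `store_count`) and every value are made-up placeholders, not the scraped or Census data, and the post does not show the actual code that was used.

```python
import pandas as pd

# Hypothetical store records, one row per scraped store (values are made up).
stores = pd.DataFrame({
    "state": ["AA", "AA", "AA", "BB", "BB"],
    "city": ["Alpha", "Alpha", "Beta", "Gamma", "Gamma"],
})

# Hypothetical population table (values are made up, not Census figures).
population = pd.DataFrame({
    "state": ["AA", "BB", "CC"],
    "population": [120_000, 90_000, 60_000],
})

# Count stores per state, then attach the count to the population table.
store_counts = stores.groupby("state").size().rename("store_count").reset_index()
by_state = population.merge(store_counts, on="state", how="left")

# States with no scraped stores get a count of zero instead of NaN.
by_state["store_count"] = by_state["store_count"].fillna(0).astype(int)
print(by_state)
```

A left merge from the population table keeps regions that have no stores at all, which matters when the goal is to find under-served areas.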

Data Analysis

This data was graphed using Python Matplotlib to show the number of stores per capita. A linear trendline was calculated, and the slope was approximately 1 CVS store per 33,000 people. This trend was the same for both cities and states.
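A linear trendline of this kind can be fitted with NumPy's `polyfit`, as in the sketch below. The data points are invented for illustration, and `polyfit` is an assumption about tooling: the post does not say which function produced the trendline.

```python
import numpy as np

# Made-up (population, store count) pairs for illustration only.
population = np.array([100_000, 250_000, 500_000, 1_000_000], dtype=float)
store_count = np.array([3, 8, 15, 30], dtype=float)

# Least-squares line: store_count ~ slope * population + intercept.
slope, intercept = np.polyfit(population, store_count, deg=1)

# The reciprocal of the slope reads as "people per store".
people_per_store = 1.0 / slope
print(f"about one store per {people_per_store:,.0f} people")
```

Reading the slope as its reciprocal gives the "one store per N people" figure quoted in the text.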

These states/cities can be grouped by whether they fall above or below the trendline (the average number of stores per capita). The regions in green have more than one store per 33,000 people, and the regions in red have fewer.

Using this framework, we can assume that locations with below-average CVS counts have a higher demand for new CVS stores. We can also rank the locations from highest to lowest demand relative to population.
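One way to express this grouping and ranking is to compare each region's store count with the count the trendline would predict. The sketch below assumes the roughly 33,000 people-per-store figure from the trendline; the region names, values, and column names (`expected_stores`, `store_gap`, `demand`) are hypothetical.

```python
import pandas as pd

PEOPLE_PER_STORE = 33_000  # approximate trendline figure from the analysis

# Made-up rows for illustration; not the scraped or Census data.
regions = pd.DataFrame({
    "region": ["A", "B", "C"],
    "population": [660_000, 330_000, 990_000],
    "store_count": [10, 15, 32],
})

# Number of stores the trendline would predict for each population.
regions["expected_stores"] = regions["population"] / PEOPLE_PER_STORE

# Positive gap = fewer stores than the trend predicts (higher assumed demand).
regions["store_gap"] = regions["expected_stores"] - regions["store_count"]
regions["demand"] = regions["store_gap"].apply(
    lambda gap: "higher" if gap > 0 else "lower"
)

# Rank regions from highest to lowest assumed demand.
ranked = regions.sort_values("store_gap", ascending=False).reset_index(drop=True)
print(ranked[["region", "store_gap", "demand"]])
```

Ranking by the absolute gap favors large regions; dividing the gap by population would be an alternative if a per-capita ranking is preferred.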

We can see that states like Washington and Colorado have higher demand, while states like Florida and Massachusetts have lower demand. This could be due to the difference in ages between the states. Florida and Massachusetts have older populations, who would need more prescriptions than younger populations.

Looking at the cities, New York City and Los Angeles have higher demand, while Miami and Washington, DC have lower demand. This could be due to the high cost of real estate in NYC and LA, where operating costs might hurt the profitability of a store.


We can cross-reference the city and state information to form a strategy about where CVS should expand.

If a state has high demand for CVS and also has cities with high demand, then CVS should build a larger presence in those cities. However, if a state has high demand for CVS but low demand in its cities, then CVS should try expanding to new cities in that state.
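The rule above can be written as a small decision function. This is a sketch only: the function name and return strings are hypothetical, and the low-demand-state branch is not covered by the rule in the text, so it is marked as such.

```python
def expansion_strategy(state_demand_high: bool, city_demand_high: bool) -> str:
    """Map state-level and city-level demand to a suggested strategy."""
    if not state_demand_high:
        # The text gives no recommendation for low-demand states.
        return "not covered by this rule"
    if city_demand_high:
        # High demand in both the state and its large cities.
        return "add stores in existing cities"
    # High demand in the state but not in its large cities.
    return "expand to new cities in the state"


print(expansion_strategy(state_demand_high=True, city_demand_high=True))
print(expansion_strategy(state_demand_high=True, city_demand_high=False))
```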


In conclusion, if CVS wants to continue adding new stores, they could focus on expanding their presence in these cities where demand is high:

  • Seattle, WA
  • Denver, CO
  • Portland, OR
  • San Juan, PR

Alternatively, they could add stores in these states where demand is high by expanding to new towns:

  • Wisconsin
  • Arkansas
  • Utah
  • Iowa

Future Work

This analysis could be expanded by adding more demographic data, such as age and income levels, to better predict where a store would be profitable. In addition, competitor data from Walgreens, Walmart, Rite Aid, etc. could be scraped from the web to look for areas with less competition.


[1] Data was scraped from the CVS Store Locator online. The CVS robots.txt file was reviewed beforehand, and the store locator pages were allowed.

About Author

Stephen Kita

Stephen is a biomedical engineer who likes to work with data and develop innovative healthcare products. He is an excellent problem-solver with a diverse background in entrepreneurship.