Locations, Industries & Years: A Deep Dive into Unicorn Companies

Posted on Nov 22, 2022


We keep hearing the term “unicorn companies” in news articles and from well-known investors. It is an investment term coined by a U.S. venture capitalist in 2013. The definition is straightforward: “a private start-up company that has a valuation of $1 billion or more”. Implicitly, though, unicorns usually share the following characteristics:

  • First movers that redefined their corresponding industries
  • Disruptive and innovative
  • Fast-growing
  • Typically backed by big-time investors

So among all the startups in the world, they are considered rare and mythical, and the capital market is obsessed with them.

This study aims to analyze and understand unicorn companies, and to answer the following questions:

  • What are the unicorn companies & where are they based?
  • When did they become a “unicorn”?
  • What interesting insights can be revealed from data analysis?


Data & Data Pre-processing

The data used for analysis is a Kaggle dataset originally scraped from CB Insights. It covers all current unicorn companies in the world as of April 2022: 1,074 companies and 10 features (company name, valuation, date joined, industry, city, country, continent, year founded, funding, and select investors).

Major data cleaning procedures applied to the dataset include cleaning up column names, converting data types, checking for outliers, duplicates and errors, and handling missing values.
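The cleaning steps listed above can be sketched roughly as follows. This is only an illustration using the Python standard library, not the code used in the study. The raw header names, the "$2B" / "$500M" money format, the "Unknown" marker for missing funding, and the helper names clean_column_name and parse_money are all assumptions, and the rows are made up.

```python
import re

def clean_column_name(name):
    """Lower-case a header and replace runs of non-alphanumerics with '_'."""
    return re.sub(r"[^0-9a-z]+", "_", name.strip().lower()).strip("_")

def parse_money(text):
    """Convert strings such as '$180B' or '$572M' to billions of dollars.

    Returns None for anything unparseable (e.g. 'Unknown'), so missing
    values stay explicit instead of silently becoming zero.
    """
    match = re.fullmatch(r"\$(\d+(?:\.\d+)?)([BM])", text.strip())
    if match is None:
        return None
    amount, unit = float(match.group(1)), match.group(2)
    return amount if unit == "B" else amount / 1000.0

# Made-up rows for illustration only; not taken from the real dataset.
raw_rows = [
    {"Company": "Alpha", "Valuation": "$2B", "Funding": "$500M"},
    {"Company": "Beta", "Valuation": "$1B", "Funding": "Unknown"},
    {"Company": "Alpha", "Valuation": "$2B", "Funding": "$500M"},  # duplicate
]

cleaned, seen = [], set()
for row in raw_rows:
    # Normalize header names, then convert the money columns to floats.
    record = {clean_column_name(k): v for k, v in row.items()}
    record["valuation"] = parse_money(record["valuation"])
    record["funding"] = parse_money(record["funding"])
    if record["company"] in seen:  # skip a company name seen before
        continue
    seen.add(record["company"])
    cleaned.append(record)
```

Keeping unparseable funding as None rather than 0 makes it easy to count and handle missing values in a separate step.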

New features are created for analysis, such as the age of the company, the number of years for a newly founded company to become a unicorn, etc.
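A minimal sketch of the two derived features mentioned above, assuming the company age is measured against the dataset snapshot year (2022) and that "years to unicorn" is the year joined minus the year founded. The field names and the example record are hypothetical.

```python
from datetime import date

REFERENCE_YEAR = 2022  # assumption: the dataset snapshot is from April 2022

def add_derived_features(record):
    """Return a copy of record with company age and years-to-unicorn added."""
    enriched = dict(record)
    enriched["company_age"] = REFERENCE_YEAR - record["year_founded"]
    enriched["years_to_unicorn"] = (
        record["date_joined"].year - record["year_founded"]
    )
    return enriched

# Made-up company for illustration only.
example = {"company": "Alpha", "year_founded": 2012, "date_joined": date(2018, 5, 1)}
features = add_derived_features(example)
```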



Feature I - Location

The top 3 continents by number of unicorns are North America, Asia, and Europe, and Europe has the highest number of countries with a unicorn. This aligns with the overall conditions of the capital markets and industry innovation in these regions.

The United States and China are the Top 2 countries in terms of the number of unicorns, accounting for 52% and 16% respectively.
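Percentage shares like these come from a simple count-and-divide. The sketch below assumes each record is a dict with a "country" key; the function name share_by_key and the placeholder rows are illustrative, not the study's actual code or data.

```python
from collections import Counter

def share_by_key(records, key):
    """Return {value: percentage of records}, largest share first."""
    counts = Counter(r[key] for r in records)
    total = sum(counts.values())
    return {k: round(100 * n / total, 1) for k, n in counts.most_common()}

# Placeholder rows; the real dataset has 1,074 companies.
toy_records = [
    {"country": "Country X"},
    {"country": "Country X"},
    {"country": "Country X"},
    {"country": "Country Y"},
]
shares = share_by_key(toy_records, "country")
```

The same helper works for continents or cities by passing a different key.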

Among the Top 20 cities, 8 are in the United States; the Bay Area, New York, Chicago and Boston are among the top areas across the country. Another 4 are in China, and they are the most developed cities on the eastern coast.

The US and China again lead in overall development, as each has multiple startup hubs across the country.


Feature II - Industry

The Top 4 industries are:

  • Fintech
  • Internet software & services
  • E-commerce & DTC
  • Artificial intelligence

They are all tech subsectors, and some smaller industries such as hardware, mobile and ed-tech are also tech-related. Even in more traditional industries, many large unicorns are powered by innovative technology, e.g. Instacart in the Supply chain, logistics & delivery sector and Yanolja in the Travel sector.

Feature III - Valuation

The following analysis looks at the distribution of company valuations by industry. Some industries contain extremely highly valued companies, such as ByteDance in AI and SpaceX in Others; both are valued at over $100 billion.

Disregarding these “centicorns”, the average valuations of unicorn companies range from $1-2 billion across industries. No significant difference between industries is observed.
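One way to compute per-industry averages while leaving out the extreme values is sketched below. The $100 billion cutoff follows the text; the record layout, the function name, and the toy numbers are assumptions for illustration only.

```python
from collections import defaultdict
from statistics import mean

CENTICORN_THRESHOLD = 100.0  # billions of dollars; cutoff taken from the text

def mean_valuation_by_industry(records, cap=CENTICORN_THRESHOLD):
    """Average valuation per industry, ignoring companies at or above cap."""
    groups = defaultdict(list)
    for r in records:
        if r["valuation"] < cap:
            groups[r["industry"]].append(r["valuation"])
    return {industry: mean(vals) for industry, vals in groups.items()}

# Made-up valuations (in $B) for illustration only.
toy_records = [
    {"industry": "Industry X", "valuation": 1.0},
    {"industry": "Industry X", "valuation": 3.0},
    {"industry": "Industry X", "valuation": 150.0},  # excluded as an outlier
    {"industry": "Industry Y", "valuation": 2.0},
]
averages = mean_valuation_by_industry(toy_records)
```

In this toy example the outlier would otherwise pull the Industry X average from 2.0 up to about 51.3, which is why it is filtered out before averaging.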

Feature IV - Years

Looking at calendar years and the number of years it takes a company to become a unicorn, two patterns emerge:

  1. The average number of years for a company to become a unicorn since being founded is 6-7 years.
  2. Counting new joiners by calendar year, there were three waves of unicorn boom, in 2015, 2018, and 2021 respectively. Most significantly in 2021, 520 companies (from 38 countries) joined the global unicorn club, a 381% increase from the previous year.
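The year-over-year figure in the second pattern can be reproduced with a count per year and a percentage change. The helper names below are hypothetical. The last line only back-calculates what the stated 520 joiners and 381% increase would imply for 2020 if both figures are exact; it is not a number taken from the dataset.

```python
from collections import Counter

def yearly_counts(join_years):
    """Number of new unicorns per calendar year, sorted by year."""
    return dict(sorted(Counter(join_years).items()))

def pct_increase(previous, current):
    """Percentage increase from previous to current."""
    return 100.0 * (current - previous) / previous

# Made-up join years for illustration only.
toy_counts = yearly_counts([2021, 2020, 2021, 2021])
toy_growth = pct_increase(toy_counts[2020], toy_counts[2021])

# Implied 2020 count, derived only from the article's 520 and 381% figures.
implied_2020 = 520 / (1 + 381 / 100)
```

Under that assumption, the implied 2020 count is roughly 108 companies.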


Further Exploration

Based on the analysis above, the contrast between the US and China and the unicorn booms stand out as worth further exploration. Specifically, the research questions are:

1. As the US and China are the top two countries with the most unicorn companies, how do they differ in terms of industry, recent trends, and valuation?

2. What happened in the years of the unicorn boom?

United States vs. China


In most industries, the US has more unicorn companies than China.

Focusing on the composition of industries in each country, U.S. unicorn companies are relatively concentrated in Fintech and Internet software & services, whereas Chinese unicorns are more dispersed across industries. The top industry in China is E-commerce & DTC, and the number of companies in the other major industries is at a similar level.
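Comparing industry composition means normalizing industry counts within each country rather than across the whole dataset. A rough sketch, assuming records with "country" and "industry" keys; the function name and the placeholder rows are illustrative.

```python
from collections import Counter, defaultdict

def industry_mix(records):
    """Return {country: {industry: share of that country's unicorns}}."""
    by_country = defaultdict(Counter)
    for r in records:
        by_country[r["country"]][r["industry"]] += 1
    mix = {}
    for country, counts in by_country.items():
        total = sum(counts.values())
        mix[country] = {ind: n / total for ind, n in counts.most_common()}
    return mix

# Placeholder rows for illustration only.
toy_records = [
    {"country": "Country X", "industry": "Fintech"},
    {"country": "Country X", "industry": "Fintech"},
    {"country": "Country X", "industry": "Fintech"},
    {"country": "Country X", "industry": "Artificial intelligence"},
    {"country": "Country Y", "industry": "E-commerce & DTC"},
    {"country": "Country Y", "industry": "Artificial intelligence"},
]
mix = industry_mix(toy_records)
```

A concentrated country shows one or two large shares (as Country X does here), while a dispersed one shows many shares of similar size.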

Recent trend

Regarding recent trends, 2021 was a big year for unicorn companies in both China and the US. There was a major jump in the U.S. in 2021, while in China the number of new joiners had been decreasing since 2018 before the downward trend reversed in 2021.

The industry composition was also quite different between the two countries. The pattern aligns with the industry comparison concluded in the previous section.


In terms of company valuations, disregarding outliers with extremely high values, there is no significant difference in average valuations across industries, with averages ranging from $1-2 billion. The U.S. average is higher than China's in all sectors except the consumer & retail industry.


As for the mega-unicorns, they drive up the average valuations in their industries.

These include JUUL Labs, SpaceX & TripActions in the US and ByteDance, Shein & Genki Forest in China.


Unicorn Boom

Recall from the earlier observations that there were 3 unicorn booms, in 2015, 2018 and 2021. To explore what happened in those years, the number of years for a company to become a unicorn (“unicorn age”) is averaged for each year. The unicorn age was relatively short in boom years, which corresponds to the larger number of new entries in those years.
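The per-year averaging of “unicorn age” described above might look like the sketch below, assuming each record carries the year the company was founded and the year it joined the unicorn list. The field names and values are made up for illustration.

```python
from collections import defaultdict
from statistics import mean

def mean_unicorn_age_by_year(records):
    """Average years from founding to unicorn status, grouped by year joined."""
    ages = defaultdict(list)
    for r in records:
        ages[r["year_joined"]].append(r["year_joined"] - r["year_founded"])
    return {year: mean(vals) for year, vals in sorted(ages.items())}

# Made-up records for illustration only.
toy_records = [
    {"year_founded": 2010, "year_joined": 2020},
    {"year_founded": 2016, "year_joined": 2020},
    {"year_founded": 2017, "year_joined": 2021},
    {"year_founded": 2018, "year_joined": 2021},
]
age_by_year = mean_unicorn_age_by_year(toy_records)
```

Plotting this series next to the yearly count of new joiners is one way to check whether shorter unicorn ages line up with boom years.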

Based on external research, the abundance of capital in the VC/PE market in those years pushed up the valuations of startup companies and contributed to the phenomenon.

By comparison, the top contributing industries for each boom were different: E-commerce & DTC in 2015, Internet software & services in 2018, and Fintech and Internet software & services in 2021.


Next Steps

The select investors of each unicorn company were not covered in the analysis. Without considering the brand names of investors, the number of investors alone is no direct indication of how successful or how big a unicorn is. Also, classifying different investing entities belonging to the same investor (e.g. Sequoia China & Sequoia India) demands extensive business knowledge and qualitative judgment. Further exploration could bring in more investment insight to expand the analysis of investors.
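As a starting point for such an investor analysis, the sketch below splits a comma-separated investor string and maps entity names to a parent investor through a hand-maintained alias table. The comma-separated format, the exact entity spellings, and the alias table itself are assumptions. Building a real table is exactly the qualitative judgment described above.

```python
def split_investors(text):
    """Split a comma-separated investor string into trimmed names."""
    return [name.strip() for name in text.split(",") if name.strip()]

# Hypothetical alias table; the real mapping needs manual business review.
ALIASES = {
    "Sequoia Capital China": "Sequoia Capital",
    "Sequoia Capital India": "Sequoia Capital",
}

def canonical_investors(text, aliases=ALIASES):
    """Return the sorted, de-duplicated parent investors for one company."""
    return sorted({aliases.get(name, name) for name in split_investors(text)})

# Made-up investor string for illustration only.
example = "Sequoia Capital China, Sequoia Capital India, Example Ventures"
investors = canonical_investors(example)
```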

Another possible expansion would be to include archived unicorn lists of previous years. This would reveal the dynamics and trends of unicorn companies, such as the change in company valuation over time and relationships between the number of exits and market conditions.
