NYC Data Science Academy| Blog

The Airline Industry's Competition and Monopoly

Chad Loh
Posted on May 11, 2022

The skills the authors demonstrated here can be learned through taking Data Science with Machine Learning bootcamp with NYC Data Science Academy.

Planning a trip can be stressful, especially when it involves flying. Finding the right ticket raises a string of questions: Is this cheap? Should I try another airline? Should I take a connecting flight? What are the hidden fees? Travel metasearch engines like Google Flights, Kayak, or Expedia save us time by collecting search results from different airlines, but the final decision is up to us.

 

✈️ Introduction

There are currently more than 60 US airlines connecting the 50 states. The top 10 airlines make up 90% of the US aviation market, and 51 others make up the remaining 10%. In this competitive industry, each airline's goal is to route and price its flights in the most efficient and profitable way. This is a very complex problem with a myriad of factors. Beyond the demand between two cities, every other existing route is a potential connecting flight that could affect the decision.

On top of that, the competitors' existing routes also factor in. The following map shows the existing flight routes in the US. The circle size represents the number of passengers using each airport in 2021, and the line thickness represents the number of passengers using each route. A different color is assigned to each airline to differentiate the routes.

 

A graph of the US airline domestic market share in 2021 (American: 19.5%, Southwest: 17.4%, Delta: 16.3%, United: 12.9%, etc.)

 

Air Route Map


 

🎯 Research Question

This project gives an overview of the US airline industry and then identifies and quantifies the effects of competition and monopoly within it. Airfare can be studied from both the customer's perspective and the airline's perspective. Here are the research questions that each side might have and that this research can potentially answer.

Customer’s perspective

  • What does the air travel route map look like?
  • Is my ticket cheaper than usual?
  • Which airline is the cheapest?
  • What affects the airfare?

Airline's perspective

  • How does the competition affect the ticket price?
  • How should we price our tickets?
  • Which route should we target next?

 

🎫 Airfare Profit for an Airline

As explained above, the pricing of the ticket is one of the most important questions in the airline industry.

First, as an overview, here is an average cost breakdown of a $100 flight ticket. The largest portion is the cost of fuel ($29). Fuel cost is also the most volatile component and is difficult to predict and control, so the ticket price fluctuates with the oil price and the global economy. Of the rest, $20 goes to salaries, $16 to assets (aircraft, facilities, offices, etc.), $14 to fees and taxes, $11 to maintenance of the aircraft and facilities, $9 to other expenditures, and just $1 to profit. The ticket price therefore includes a large base cost that has no direct relationship with the specific route. We examine this 'base price' in more detail later.
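
As a quick check on the arithmetic, the breakdown above can be written down and scaled to other fares. This is only a sketch: the dictionary keys and the `breakdown` helper are hypothetical names, and scaling every item in proportion to the fare is a simplifying assumption, not something the source data confirms.

```python
# Cost shares of a $100 ticket, as listed in the paragraph above.
COST_PER_100 = {
    "fuel": 29,
    "salaries": 20,
    "assets": 16,
    "fees_and_taxes": 14,
    "maintenance": 11,
    "other": 9,
    "profit": 1,
}


def breakdown(ticket_price):
    """Scale the $100 breakdown to another fare (assumes proportional shares)."""
    return {item: ticket_price * share / 100 for item, share in COST_PER_100.items()}


# The seven items should account for the whole $100 ticket.
total = sum(COST_PER_100.values())

# Example: split the mean fare reported in this article ($193.33).
mean_fare_split = breakdown(193.33)
```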

Fuel and part of the salaries depend on the flight distance, so the relationship between fare and distance is examined here. The map below shows the average fare between city pairs. The available dataset, unfortunately, only contained data from the contiguous 48 states. Therefore, care should be taken when interpreting the results for Alaska Airlines and Hawaiian Airlines.

As expected, the expensive flights were mostly transcontinental flights. The route with the highest average direct one-way cost was between Boston and San Francisco ($360.06), followed by Jackson, WY - New York ($359.31) and San Francisco - Washington DC ($342.55). On the other hand, the cheaper flights were mostly short-distance flights. The median and mean prices were $182.30 and $193.33, respectively. 

 

📏 Distance vs Price for an Airline

Here, we run a linear regression to quantify the base fare and the cost per distance for each airline. The scatter plot below compares the flight fare against the flight distance. A separate linear regression was fitted to the points of each airline, and the results are shown on the right. β0 is the y-intercept, which can be read as the base fare that the customer pays even if the flight distance is zero. β1 is the cost per distance. For example, the price increases by $54.63 for every 1,000 miles flown with American Airlines.
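
A per-airline fit of this kind might look like the sketch below. It runs on synthetic data, not the BTS dataset: only the American slope of 54.63 comes from the text, while the intercepts, the Spirit slope, the noise level, and the sample sizes are hypothetical placeholders chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# (beta0, beta1) used to generate synthetic fares per airline.
# Only American's slope (54.63 dollars per 1,000 miles) is quoted in the text;
# every other number here is a hypothetical placeholder.
true_params = {"American": (150.0, 54.63), "Spirit": (60.0, 30.0)}

fits = {}
for airline, (b0, b1) in true_params.items():
    # Distance in thousands of miles, fare in dollars, with random noise.
    distance = rng.uniform(0.2, 2.7, size=200)
    fare = b0 + b1 * distance + rng.normal(0, 5, size=200)

    # np.polyfit returns coefficients from the highest degree down,
    # so a degree-1 fit gives (slope, intercept).
    slope, intercept = np.polyfit(distance, fare, deg=1)
    fits[airline] = {"beta0": intercept, "beta1": slope}

for airline, fit in fits.items():
    print(f"{airline}: base fare {fit['beta0']:.2f}, per 1,000 miles {fit['beta1']:.2f}")
```

With real data, the synthetic arrays would be replaced by each airline's observed route distances and average fares.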

Unsurprisingly, low-cost carriers like Spirit and Frontier have lower prices than other airlines, as is evident from their regression lines lying below the others. Regular full-service carriers like American, Delta, and United all had relatively higher prices.

The regression results are compared again in the simple graphs below. The most expensive base fares were from SkyWest, American, Delta, JetBlue, and United. While American, Delta, and United were expected to be more expensive since they are full-service airlines, SkyWest and JetBlue were not expected to be on this list. For the cost per mile, United, Alaska, Delta, Southwest, and American were the more expensive airlines. On the cheaper side, Spirit and Frontier had the lowest base fare β0 and cost per mile β1 among the top 10 airlines.

 

🥊 Competition between airlines

The R² of the price vs. distance linear regression was not satisfactory. It can be improved by fitting with multiple variables, for example the competition with other airlines. Unlike store-bought products that have fixed prices, flight tickets have variable prices that depend on supply and demand. Higher competition with other airlines would therefore be expected to reduce the price of the ticket.

In the following scatter plot, the points represent the average flight fare and flight distance for each city pair. The color of the points and of the regression lines represents the number of airlines flying the same route. As expected, the routes with higher competition (blue line) had lower prices than the routes that were monopolized by one company (red line). The trend can be observed in the scatter plot and in the numbers on the right.
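
To illustrate how an extra variable can raise R², here is a minimal sketch on synthetic data. The `r_squared` helper, the coefficients used to generate the fares, and the additive effect of competition are all assumptions made for this example, not values from the article's dataset.

```python
import numpy as np


def r_squared(X, y):
    """Fit y ~ X by ordinary least squares (intercept added) and return R^2."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    residuals = y - A @ coef
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


rng = np.random.default_rng(1)
n = 500
distance = rng.uniform(0.2, 2.7, n)    # thousands of miles
competitors = rng.integers(1, 6, n)    # 1 to 5 airlines on the route

# Hypothetical data-generating process: fares fall as competition rises.
fare = 120 + 50 * distance - 8 * competitors + rng.normal(0, 10, n)

r2_distance_only = r_squared(distance, fare)
r2_with_competition = r_squared(np.column_stack([distance, competitors]), fare)

print(f"distance only:          R^2 = {r2_distance_only:.3f}")
print(f"distance + competition: R^2 = {r2_with_competition:.3f}")
```

Because the second model contains the first as a special case, its R² on the same data can never be lower. How much it improves depends on how strongly competition actually affects the fare.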

β0 and β1 are plotted against the number of competing companies in the chart below. The base price β0 stayed roughly constant regardless of the number of competing companies. This is because the airline has to pay for the airport fees, facilities, and other expenditures whether there is competition on its routes or not. However, the cost per mile β1 decreased as competition increased.

 

🏕️ Hub's effect for an airline

As shown in the previous section, a monopoly on a route can lead to a higher ticket price. Then how does a monopoly of a hub affect the ticket price? 

According to the linear regression result below, Delta and United fares are about $10 higher when the route uses one of the airline's hubs. The coefficient estimate for American Airlines also indicates that using a hub increases the fare, but it was not statistically significant according to the p-value. The price increase (or "hub premium") could be due to these airlines' dominance at some hub airports - for example, Charlotte (91% American), Atlanta (79% Delta), Dallas DFW (85% American), Miami (75% American), Houston IAH (81%), and Newark (70% United).

Interestingly, the low-cost carriers had lower fares when the flight used a hub. Using a hub decreases the overall operating cost because an airline can funnel its air traffic through the hub and reduce the number of aircraft in operation. For example, it takes 45 direct routes to connect 10 destinations point-to-point, while the same destinations can be connected with 9 routes if one of the airports is a hub. For Spirit Airlines, the fare is almost $20 cheaper for the same distance when the route uses one of its hubs.
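
The route counts in this example follow from simple combinatorics: a point-to-point network needs one route per pair of airports, while a single-hub network needs one route from the hub to every other airport. A minimal sketch, with hypothetical function names:

```python
def point_to_point_routes(n_airports):
    """Routes needed to link every pair of airports directly: n * (n - 1) / 2."""
    return n_airports * (n_airports - 1) // 2


def hub_and_spoke_routes(n_airports):
    """Routes needed when one airport acts as the hub for all the others: n - 1."""
    return max(n_airports - 1, 0)


for n in (5, 10, 20):
    print(n, point_to_point_routes(n), hub_and_spoke_routes(n))
```

The gap widens quickly as the network grows, since the point-to-point count grows quadratically and the hub-and-spoke count only linearly.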

 

🧮 Multiple Linear Regression for an Airline

Now we combine all of the previous analyses into a multiple linear regression model. Each airline has a price prediction model with three variables: distance, hub factor, and competition factor. The regression model is shown in the figure below, where D is the distance in 1,000 miles, H is the hub factor (1 if the route starts or ends at one of the airline's hubs, 0 otherwise), and C is the number of airlines competing on the route (including the airline itself). We can then predict the average price of a flight with this model. An example is shown below, where the Delta fare between New York and Las Vegas is predicted as $264.30.

From the airline's perspective, we can use this model to price a route that does not exist yet. For example, Delta has no direct flights between Austin and Las Vegas. The distance is 1,090 miles, neither of the airports is a Delta hub, and three other airlines fly direct between Austin and Las Vegas (thus 4 competing airlines once Delta launches this new route). Using this model, we predict the average price to be $193.19.
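
The sketch below shows how such a model could be evaluated for a new route. It assumes the model is additive in D, H, and C, which matches the description above but cannot be checked against the figure here. The `FareModel` class and all four coefficients are hypothetical (the hub term only echoes the roughly $10 premium mentioned earlier), so the output is not expected to reproduce the $193.19 prediction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FareModel:
    """Assumed form: fare = beta0 + b_distance * D + b_hub * H + b_competition * C."""
    beta0: float             # base fare in dollars
    beta_distance: float     # dollars per 1,000 miles
    beta_hub: float          # hub premium (positive) or discount (negative)
    beta_competition: float  # dollars per airline on the route

    def predict(self, distance_kmi, uses_hub, n_airlines):
        hub_flag = 1 if uses_hub else 0
        return (self.beta0
                + self.beta_distance * distance_kmi
                + self.beta_hub * hub_flag
                + self.beta_competition * n_airlines)


# Hypothetical coefficients for illustration only, not the fitted Delta model.
hypothetical_delta = FareModel(beta0=170.0, beta_distance=50.0,
                               beta_hub=10.0, beta_competition=-5.0)

# Austin to Las Vegas as described above: 1,090 miles, no hub, 4 airlines.
austin_las_vegas = hypothetical_delta.predict(1.090, uses_hub=False, n_airlines=4)
print(f"Predicted average fare: ${austin_las_vegas:.2f}")
```

Swapping in the fitted coefficients for a given airline would turn this into the pricing tool described in the text.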

 

📝 Summary

  • Many factors affect airfare: airline, distance, airport, competition, … 
  • Base airfare, cost per distance, hub discount/premium, and monopoly premium can be calculated for each airline and route
  • A multiple linear regression model was built for predicting the average airfare between two cities
  • Low-cost airlines save money by funneling the traffic through their hubs → price decreases
  • Large full-service airlines like Delta, United, and American dominate their hub airports → price increases

 

➡️ What's next for airlines?

The next step would be to improve the model with more data. One important piece of information missing from this dataset is how the price changes over time, depending on when you travel and when you book. Ticket prices surge with increased travel demand around vacation seasons and national holidays. The travel season also depends on the destination: people look for warmer weather in the winter and for beaches in the summer.

There are several articles on when to book your tickets. Statistically, it is ideal to book 1 to 4 months before the trip. There was also a belief that Tuesday is the best day to book a ticket, which turns out to no longer be true. Including such temporal data would improve the price prediction model.

Another interesting subject to study is the rise of low-cost carriers. Southwest Airlines, the largest low-cost carrier in the US, now has a market share comparable to that of the traditional airlines, which suggests that its pricing model has been successful. Comparing the pricing models and predicting the flight demand of traditional and low-cost carriers could be valuable for future decisions at all airlines.

Lastly, the effect of the hub is an interesting subject to revisit. As seen in this research, the hub-and-spoke and point-to-point systems of airlines have a significant effect on the ticket price.

 

Data Source

  • U.S. Air Carriers Traffic and Capacity Data, Bureau of Transportation Statistics
  • Domestic Airline Consumer Airfare Report, Bureau of Transportation Statistics
  • World Airport Codes, Bureau of Transportation Statistics
  • Airport longitude & latitude data, Ourairports.com
  • How Airlines Spend Your Airfare, The Wall Street Journal
  • Domestic market share of leading U.S. airlines, Statista
  • Cover photo: Photo by Matthew Smith on Unsplash
