Data Analysis on Post-COVID Air Travels
Github Repository | LinkedIn: Ryan Kniewel, James Welch
Before We Start
Turning large datasets into actionable insights is a core part of any data scientist’s job. For this quarter’s hackathon at the NYC Data Science Academy, we were provided a dataset of flights curated from a Kaggle submission. Our prompt was to consider the strengths and weaknesses of four major airlines and present a data-driven strategy for them to adopt.
After a brainstorming session, we decided to focus on long- and short-term growth through the lens of trend changes driven by COVID-19. We worked over the course of a week and put together our presentation for the judges. Our findings are presented below.
Introduction
The airline industry is at an inflection point. After a year of pandemic restrictions, hope is on the horizon and the industry is primed to accelerate out of this slump.
The COVID-19 pandemic dramatically reduced airline travel through travel restrictions, lockdowns, and initial safety concerns. A year after the first cases in the US, travel volume is still down 60%. While airlines are far from the only industry affected, they are certainly near the top of the list. Business travel, which formerly accounted for just 12% of travelers but 75% of airline profits, has been decimated by the pandemic. Many routine business trips have likely been permanently displaced by remote work and online meetings.
But the outlook is improving: safety measures, therapeutics, and vaccines suggest that the end of the pandemic is in sight. As air travel resumes, airlines are in a position to accelerate into the future.
Our analysis encompasses three key themes:
(1) Identifying cities and airports that are undervalued in the old business model, where an increased presence can yield great returns.
(2) Leveraging growing vacation travel demand to gain market share and provide a counter-cyclical profit driver.
(3) Using coupons as a nudge to shape customer behavior.
First Impressions
Our dataset was sourced from the US Bureau of Transportation Statistics and was curated for only domestic flights, which includes all 50 states, Puerto Rico and the US Virgin Islands. This dataset contained ticket information such as the purchase price, number purchased and how many coupons were available for the flight. The tickets were linked to specific flights with origin and destination airports, and the distance traveled.
Finally, included metadata provided the purchase code, the calendar year quarter, and the airline. This initial dataset was a starting point, but to enhance our analysis we integrated additional sources of information.
To understand the changing population dynamics of cities, we used US Census data for the decade from 2010 to 2019 and a more specific analysis of 2018-2019. To understand COVID-19-specific trends, we integrated an analysis by the McKinsey Global Institute that compiled changes in LinkedIn users' locations. We also included a vacation destination trend analysis from Forbes and RENTCafé and incorporated data on hub and focus airports for each airline. By bringing all these data sources together, we assembled a picture of how the airline industry was optimized in the past and how we propose it be reorganized for the future.
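Combining flight records with census figures comes down to aggregating by airport and then looking up each airport's metro area. The sketch below is only an illustration of that join, not the code we ran: the function name `airport_summary`, the record layouts, the placeholder airport codes, and every number are hypothetical.

```python
from collections import defaultdict

# Hypothetical ticket rows: (origin_airport, passengers). Values are invented.
FLIGHTS = [("AAA", 2), ("AAA", 1), ("BBB", 4), ("ZZZ", 1)]
# Hypothetical lookup from airport code to metro area.
AIRPORT_TO_METRO = {"AAA": "Metro A", "BBB": "Metro B"}
# Hypothetical census figures: metro -> (2010 population, 2019 population).
METRO_POP = {"Metro A": (1_000_000, 1_300_000), "Metro B": (2_000_000, 2_000_000)}

def airport_summary(flights, airport_to_metro, metro_pop):
    """Join passenger volume per airport with metro population growth."""
    volume = defaultdict(int)
    for airport, passengers in flights:
        volume[airport] += passengers

    rows = {}
    for airport, total in volume.items():
        metro = airport_to_metro.get(airport)
        if metro is None or metro not in metro_pop:
            continue  # drop airports that have no census match
        pop_2010, pop_2019 = metro_pop[metro]
        rows[airport] = {
            "metro": metro,
            "passengers": total,
            "growth": (pop_2019 - pop_2010) / pop_2010,
        }
    return rows

summary = airport_summary(FLIGHTS, AIRPORT_TO_METRO, METRO_POP)
for airport, row in summary.items():
    print(airport, row)
```

Airports without a census match (here the placeholder `ZZZ`) are dropped rather than guessed at, which keeps the later growth comparisons honest.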
Opportunities in Growing Cities
Analysis of route networks indicated that three of the four airlines (American, United, and Delta) use a hub system (Figure 1). In this system, passengers from outlying airports fly to a hub and transfer there to their final destination, or at most connect through a second hub first. This model concentrates infrastructure and costs in a few locations and has served business and leisure travelers well for decades.
Southwest Airlines, on the other hand, has a different business model that is evident from its highly distributed route network. It specializes in middle-distance routes and provides more connections between a larger range of mid-size cities than its competitors.
The existing route system is highly optimized and has carried millions of passengers a year, but like many optimizations, it isn't easy to change when conditions shift.
Data Plots and Graphs
Figure 2: Relationship between metro area population, flight volume and airline hub locations.
Hub airports are indicated by black points.
Historically, hubs have been located in the largest, most dynamic cities in the United States, but that is no longer necessarily true (Figure 2). Cities like Detroit and Chicago have large metropolitan areas and serve as hubs for the upper Midwest, yet they experienced zero or negative population change in the 2010s (Figure 3).
Figure 3: Relationship between population increase in the 2010s and flight volume
Hub airports are indicated by black points.
Analysis of the route network for airports closest to the top 15 fastest growing cities in 2018-2019 reveals that each airline is differently positioned to service these fast growing regions (Figure 5).
Figure 5: Airline route networks serving the top 15 high-growth regions for 2018-2019
Hub airports are indicated by green points, other airports are blue. Airports closest to the top 15 fastest growing areas for 2018-2019 are designated by their three letter IATA code. Route passenger volume is indicated by increased line widths and color depths.
One metro area that is rapidly gaining population and is prominent in both the Census and LinkedIn growth data is around Austin, Texas.
Data Case Study: Austin
Over the past decade, the Austin metro area, which includes Georgetown and Round Rock, has grown by an astonishing 30%. Not only is the whole metro area growing, but Austin is also surrounded by some of the fastest-growing cities of recent years: Leander, Georgetown, and New Braunfels. The Austin airport, AUS, is one of the largest in the area and served 13,000 passengers on 246 routes (2018 data).
One area for disruption that our analysis revealed is the many routes that are predominantly served by one airline, even between populous cities. In our dataset, Southwest is the dominant airline serving Austin, yet it competes heavily with United and American on many routes. One example is the Austin-Chicago route.
Because Chicago O’Hare is a hub for both United and American, these two airlines are positioned to take market share from their dominant competitor, Southwest. On the flip side, Southwest could fortify its foothold in Austin by increasing the range of routes serving AUS, increasing brand visibility, and building customer loyalty.
However, AUS isn’t the only airport serving this cluster of metro areas. San Antonio (SAT) is only a 90-minute drive from Austin, and New Braunfels is even closer. San Antonio has been growing at half the rate of the Austin metro area, but as the region continues to grow, the difference between AUS and SAT for that population will come down to choice rather than proximity.
This led us to suspect that there may be many other underserved airports in cities with large or growing populations where one airline provides most of the flights, so we attempted to quantify that insight.
High Potential Cities Data
The list above was generated by analyzing all the cities in our 2018 flights dataset. We linked each city to its 2019 population and its change in population over the 2010s, then counted how many routes each carrier provided at its airport. We kept airports where population growth exceeded 10% and a single airline accounted for more than 50% of the routes.
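The filter described above can be sketched in a few lines of Python. This is a minimal illustration rather than the code we used in the hackathon: the function name `high_potential_airports`, the record layout, the placeholder airport codes, and the sample numbers are all invented, and only the two cut-offs (growth above 10%, one carrier holding more than 50% of routes) come from the analysis.

```python
from collections import Counter, defaultdict

# Hypothetical sample: one (airport, carrier) pair per route. Values are invented.
ROUTES = [
    ("AAA", "WN"), ("AAA", "WN"), ("AAA", "WN"), ("AAA", "UA"),
    ("BBB", "AA"), ("BBB", "DL"), ("BBB", "UA"), ("BBB", "WN"),
    ("CCC", "DL"), ("CCC", "DL"), ("CCC", "AA"),
]
# Hypothetical fractional population change over the 2010s.
POP_CHANGE = {"AAA": 0.30, "BBB": 0.25, "CCC": 0.02}

def high_potential_airports(routes, pop_change, min_growth=0.10, min_share=0.50):
    """Return {airport: (dominant_carrier, route_share)} for airports passing both cut-offs."""
    by_airport = defaultdict(Counter)
    for airport, carrier in routes:
        by_airport[airport][carrier] += 1

    result = {}
    for airport, counts in by_airport.items():
        carrier, n_routes = counts.most_common(1)[0]
        share = n_routes / sum(counts.values())
        # Both conditions are strict inequalities, matching "greater than" in the text.
        if pop_change.get(airport, 0.0) > min_growth and share > min_share:
            result[airport] = (carrier, share)
    return result

candidates = high_potential_airports(ROUTES, POP_CHANGE)
print(candidates)
```

In this toy data only `AAA` survives: `BBB` is growing but no carrier dominates it, and `CCC` has a dominant carrier but little growth. Keeping the cut-offs as parameters makes it easy to see how sensitive the list is to them.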
The utility and forward-looking model of Southwest’s distributed network is again obvious in this analysis. Their network is already well positioned to provide a variety of routes serving smaller airports across the nation. Southwest is favorably overrepresented in this analysis due to many high growth cities being located in the southwestern US.
This is a curated list and the cut-offs we used were arbitrary. We think that with more domain knowledge and a broader set of data, more informed cut-offs could be selected.
Ultimately, investing in new hub cities is a costly decision which would require more information, analysis, and expertise. However, we think this analysis can offer a promising strategy for investing in growing cities with an eye toward sustaining the growth opportunities for each airline. Like any investment, hub reorganization would not be expected to pay off immediately but in the current economic climate repositioning to take advantage of growth markets could be fruitful.
Leisure Travel as Opportunity
Besides business travel, leisure is the main reason customers fly: occasionally to relocate, more often to visit family, but most commonly to vacation. Leisure travel generally peaks in the summer and winter, but increasing the volume of holiday travel during off-peak periods could offset profit lost from decreased business travel.
Another trend seen recently is the “COVID working vacation” where professionals, taking advantage of work-from-home policies, relocate to smaller cities, rural areas, or vacation destinations. This pattern was illuminated through a recent search trend analysis by RENTCafe.com.
Search Trends
Their analysis discovered that in addition to vacation search volume being down, the type of destinations cited in chatter online was also changing. Keywords representing beaches, natural attractions, and small towns were increasingly being searched. While many of these locations overlap with popular vacation destinations, some airlines are in a better position to take advantage of this trend than others.
Figure 6 reveals the existing route networks serving vacation destinations being searched on the internet. These are airports closest to beaches, natural attractions, and small towns, while also including some popular vacation city destinations (e.g. Los Angeles and Las Vegas).
Overall, three airlines (American, United, and Southwest) already have diverse routes serving many vacation destinations. Southwest makes up somewhat for its lack of longer Hawaii routes with a high density of flights to southern locales, including Florida, where it has a hub in Orlando (MCO). (Note: Southwest recently added routes to Hawaii, which weren't in our 2018 dataset; a great confirmation of our thesis.) Delta, on the other hand, has relatively few flights to Caribbean destinations, flying predominantly from its hubs in Atlanta and New York.
Figure 6: Airline route networks serving vacation destinations
Hub airports are indicated by green points, other airports are blue. Airports closest to vacation destinations are designated by their three letter IATA code. Route passenger volume is indicated by increased line widths and color depths.
We wanted to look specifically at travel to the Caribbean as another case study. All four airlines fly to Caribbean airports in Puerto Rico, St. Thomas, and St. Croix. Together these accounted for 239 routes and 56,000 passengers in 2018.
Figure 7: Airline route networks serving Caribbean Destinations
Hub airports are indicated by green points, other airports are blue. Airports closest to vacation destinations are designated by their three letter IATA code. Route passenger volume is indicated by increased line widths and color depths.
Increasing service to the Caribbean and other vacation destinations will help airlines take advantage of pent-up travel demand as the pandemic begins to subside. Many of these vacation destinations are also fast-growing cities, and one we wanted to highlight is Sarasota, Florida.
Combining Our Thesis
Over the past decade the Sarasota (SRQ) metro area has grown nearly 20%. It is an hour's drive from Tampa (TPA), another large, high-growth city. Sarasota sits on the western coast of Florida and has a large beachfront. Sarasota therefore illustrates both of our theses: there are many growing cities where airlines can establish a footprint to capture market share, and vacation destinations are locations in need of increased throughput.
Many growth areas are suburbs lying at the edge of, or between, existing metro areas and for most suburbs the difference between TPA and SRQ airports is a slightly longer drive and the choice of flights.
The final question is whether consumer behavior will match our predictions, and a feature in our dataset might be leveraged to encourage it.
SWOT Data Analysis
| | American Airlines | United | Delta | Southwest |
| Strengths | Hubs in growth cities: PHX, DFW; hubs in vacation destinations: MIA, LAX; 3 hub cities serve Hawaii routes | Hubs in growth cities: IAH, DEN; hubs in vacation destination: LAX; many Hawaii routes | Hubs in growth city: SLC; hubs in vacation destinations: LAX; many Hawaii routes | Hubs in growth cities: MCO, HOU, DAL, LAS, DEN; massive route network |
| Weakness | Less expansive route network with respect to high growth areas and most vacation destinations | Having hubs in northern cities might hinder establishment of new routes in growth areas and vacation destinations | No Hawaii routes as of 2018 (vacation destination) | |
| Opportunity | Exploit growth markets currently only served by one competitor airline; use MIA hub to undercut vacation travel for FL routes | Exploit growth markets currently only served by one competitor airline | Exploit growth markets currently only served by one competitor airline; since several routes to HI are established, increase marketing/advertising push | Bolster cornered markets where no route competition exists; use extensive route network to reinforce vacation travel market |
| Threat | Need to maintain market share in growth areas and vacation destinations; heavy FL competition | Need to maintain market share in growth areas and vacation destinations | Having much route coverage in northern cities may prove detrimental as these are not growth areas (with the exception of SEA and PDX); very weak route coverage for some vacation areas (Caribbean) | Exposure to competition in cornered markets in growth areas; heavy FL vacation travel competition |
Changing Behavior with Coupons
The documentation from the Bureau of Transportation Statistics for the coupon feature was a little suspect. Every flight had at least 1 coupon available, which set off some alarm bells, and we don’t know whether the customer used those coupons or what the coupon was used for (Figure 8).
Regardless, we believe coupons are an underutilized resource, and embracing them could serve as a way to nudge consumer behavior and respond to consumer sentiment.
Figure 8: Relationship between ticket prices and number of coupons for flights by the four airlines
Only 3% of flights had more than 1 coupon available, so there is a lot of room for improvement. Note that the boxes above are similarly sized, but there are nearly 40 times as many flights in the 1-coupon category as in the 3-coupon category.
Even with the low number of multi-coupon flights, we can see a broad trend of more coupons being available for more expensive flights. The trend is strong only for United; for the other airlines, the difference between 1-coupon and 3-coupon flights is around $50.
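The two numbers discussed here, the share of multi-coupon flights and the price gap between coupon categories, are simple group-wise summaries. The sketch below shows one way to compute them, assuming each row carries an airline, a coupon count, and a fare; the row layout, function names, and all values are hypothetical and are not taken from our dataset.

```python
from collections import defaultdict
from statistics import median

# Hypothetical ticket rows: (airline, coupons, fare_usd). Values are invented.
TICKETS = [
    ("UA", 1, 200.0), ("UA", 1, 240.0), ("UA", 3, 400.0),
    ("WN", 1, 150.0), ("WN", 1, 170.0), ("WN", 2, 210.0),
]

def multi_coupon_share(tickets):
    """Fraction of rows with more than one coupon."""
    return sum(1 for _, coupons, _ in tickets if coupons > 1) / len(tickets)

def median_fare_by_group(tickets):
    """Median fare keyed by (airline, coupon count)."""
    groups = defaultdict(list)
    for airline, coupons, fare in tickets:
        groups[(airline, coupons)].append(fare)
    return {key: median(fares) for key, fares in groups.items()}

share = multi_coupon_share(TICKETS)
fares = median_fare_by_group(TICKETS)
print(f"multi-coupon share: {share:.1%}")
for key in sorted(fares):
    print(key, fares[key])
```

We use the median here because a boxplot is centered on it, and because a handful of very expensive tickets would otherwise skew a small multi-coupon group.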
While we don’t know how these coupons are applied or even if they are used by the customer, we think they can be leveraged to nudge behavior. The coupon doesn’t necessarily need to reduce the cost of the flight. For many consumers, a drink coupon or a free checked bag can be the difference between booking a flight and delaying the purchase.
The coupons may not need to come from the airlines themselves. The value of tourists is so high that destinations might be willing to partner with airlines, compensating them for coupons that nudge vacationers toward one destination or another.
Conclusions
Though the current position of airlines is precarious, there is the opportunity for growth and expansion in a post-COVID world. We hope we have shown that smaller, fast-growing cities can be the future of your hub system, that vacation travel presents a reliable opportunity to make up for a lost customer base, and that a better utilization of coupons can help shape demand. The future is in your hands, and with an appropriate strategy you are cleared for take-off.
Our Thoughts
We both had a great time working on this hackathon. It was a lot of work, with a couple of long nights and some last-minute adjustments, but we are proud of what we made. Though we only had a week to work on our presentation, and our experience with the airline industry is limited to what we have seen from inside the cabin, this was a rich learning opportunity.
Ultimately, we wish we had more time. We wanted to delve into predictive models for these routes, build dashboards that highlight ‘orphan routes’ served by only one provider, and explore the idea of airline companies as aggregators of consumer demand. Lastly, we are looking forward to the future: learning with NYCDSA, analyzing data, and building. Keep a lookout for our next projects, and thank you for reading.
Our Data Sources
Airline Hubs, Codes, Locations
Metropolitan Statistical Area Population, 2010-2019 Population Growth
US Census, 2018-2019 Fastest Growing Cities