Dubai International Airport Flights Analysis - Scraping project

Posted on Feb 19, 2018

Dubai International Airport is the busiest airport of the year! Is it a blessing or a curse?

A busy airport with heavy passenger traffic is often taken as a good sign for a country, but that is not always the case. In Dubai, passenger traffic has caused many problems, such as flight delays, flight cancellations, and passengers missing their flights. The questions here are: Is it possible to avoid these problems? Do airlines and flight schedules have an impact on them?

A flight-booking scraping project was carried out to analyse the flights arriving in Dubai from 10 other busy airports around the world over a whole month (March 2018). The main goal of the project is to identify the main factors affecting the traffic and whether there is a solution.

The data was scraped from a flight booking site that shows the user all the flight options available on a certain date to a specific destination. Since this project analyzes the traffic at Dubai International Airport, the destination was fixed to DXB (Dubai International Airport). The departure airport was varied to collect the flights coming from 10 different airports: London (LHR), USA (ALT), Thailand (BKK), Singapore (SIN), Germany (FRA), Paris (CDG), Seoul (ICN), Hong Kong (HKG), Amsterdam (AMS), and Taiwan (TPE). For each of these airports, flight information was gathered for March 1, 2018 through March 31, 2018.
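
The search grid described above (one fixed destination, ten departure airports, 31 days) can be sketched as follows. This is only an illustration of how the queries could be enumerated, not the scraper used in the project; the function names and the dictionary keys are assumptions, and the airport codes are copied as listed in the post.

```python
from datetime import date, timedelta
from itertools import product

# Airport codes exactly as listed in the post; the destination is always DXB.
ORIGINS = ["LHR", "ALT", "BKK", "SIN", "FRA", "CDG", "ICN", "HKG", "AMS", "TPE"]
DESTINATION = "DXB"


def march_2018_days():
    """Return every date from March 1 to March 31, 2018."""
    start = date(2018, 3, 1)
    return [start + timedelta(days=i) for i in range(31)]


def build_searches():
    """Pair every origin with every day: one search per (origin, date)."""
    return [
        {"origin": origin, "destination": DESTINATION, "date": day.isoformat()}
        for origin, day in product(ORIGINS, march_2018_days())
    ]


searches = build_searches()
print(len(searches))  # 10 origins x 31 days = 310
print(searches[0])    # {'origin': 'LHR', 'destination': 'DXB', 'date': '2018-03-01'}
```

Each dictionary would then drive one search on the booking site; how the site is queried is not described in the post, so that step is left out.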

Two data sets were considered. The first holds the total number of flights coming from each airport to Dubai throughout the month and consists of 311 rows. The second holds the details of each flight and consists of 26,429 rows.

First data set sample (first 5 rows)

Departure    | Arrival     | Total Number of Flights | Day (March 2018)
London (LHR) | Dubai (DXB) | 453                     | 1
London (LHR) | Dubai (DXB) | 522                     | 2
London (LHR) | Dubai (DXB) | 560                     | 3
London (LHR) | Dubai (DXB) | 525                     | 4
London (LHR) | Dubai (DXB) | 566                     | 5


Second data set sample (first 5 rows)

One-way | Departure Airport | Arrival Airport | Departure date | Class   | Airlines          | Departure Airport code
One-way | London (LHR)      | Dubai (DXB)     | Thu 3/1        | Economy | Multiple Airlines | LHR
One-way | London (LHR)      | Dubai (DXB)     | Thu 3/1        | Economy | Multiple Airlines | LHR
One-way | London (LHR)      | Dubai (DXB)     | Thu 3/1        | Economy | Multiple Airlines | LHR
One-way | London (LHR)      | Dubai (DXB)     | Thu 3/1        | Economy | Multiple Airlines | LHR
One-way | London (LHR)      | Dubai (DXB)     | Thu 3/1        | Economy | Multiple Airlines | LHR


Departure Time | Arrival Airport code | Arrival Time  | Number of extra days | Total Flight Duration | StopOvers | Offer Website | Price
8:25 PM        | DXB                  | (+2) 3:05 AM  | 0                    | 26h 40m               | BCN, SAW  |               | $337
9:45 AM        | DXB                  | (+1) 3:05 AM  | (+1)                 | 13h 20m               | CGN, SAW  |               | $343
6:40 AM        | DXB                  | (+1) 12:50 AM | (+1)                 | 14h 10m               | ARN       |               | $352
10:20 AM       | DXB                  | (+1) 3:00 AM  | (+1)                 | 12h 40m               | OSL, HEL  |               | $382
8:05 AM        | DXB                  | (+1) 3:05 AM  | (+1)                 | 15h 00m               | DUS, SAW  |               | $389
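
Fields such as the duration, the price and the arrival time are scraped as text, so they need to be converted before any counting or plotting. The sketch below shows one way this could be done with the standard library. The helper names are hypothetical, and it assumes the day-offset marker such as "(+1)" is attached to the arrival time; the post does not show the cleaning code that was actually used.

```python
import re


def duration_minutes(text):
    """Convert a duration such as '26h 40m' to a number of minutes."""
    match = re.fullmatch(r"\s*(\d+)h\s+(\d+)m\s*", text)
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    hours, minutes = map(int, match.groups())
    return hours * 60 + minutes


def price_dollars(text):
    """Convert a price such as '$337' to an integer number of dollars."""
    return int(text.replace("$", "").replace(",", ""))


def split_arrival(text):
    """Split '(+1) 3:05 AM' into (extra_days, '3:05 AM'); no marker means 0."""
    match = re.fullmatch(r"\s*(?:\(\+(\d+)\)\s*)?(\d{1,2}:\d{2}\s*[AP]M)\s*", text)
    if match is None:
        raise ValueError(f"unrecognised arrival time: {text!r}")
    extra_days = int(match.group(1)) if match.group(1) else 0
    return extra_days, match.group(2)


# Values taken from the first sample row of the second data set.
print(duration_minutes("26h 40m"))    # 1600
print(price_dollars("$337"))          # 337
print(split_arrival("(+2) 3:05 AM"))  # (2, '3:05 AM')
```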


Visualization and Analysis

The graphs below answer the following questions:

1. From which airport do most of the flights to Dubai come?


This graph shows that the highest number of flights arriving in Dubai comes from London.
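
A ranking like this one can be obtained by summing the daily counts of the first data set per departure airport. The sketch below is an assumption about how that aggregation might look, not the code behind the graph. It uses only the five London rows published in the sample, so a single airport appears in the output.

```python
from collections import Counter

# The five sample rows of the first data set: (departure, arrival, flights, day).
SAMPLE_ROWS = [
    ("London (LHR)", "Dubai (DXB)", 453, 1),
    ("London (LHR)", "Dubai (DXB)", 522, 2),
    ("London (LHR)", "Dubai (DXB)", 560, 3),
    ("London (LHR)", "Dubai (DXB)", 525, 4),
    ("London (LHR)", "Dubai (DXB)", 566, 5),
]


def flights_by_departure(rows):
    """Sum the daily flight counts for each departure airport."""
    totals = Counter()
    for departure, _arrival, flights, _day in rows:
        totals[departure] += flights
    return totals


def rank_departures(rows):
    """Return (departure, total) pairs, largest total first."""
    return flights_by_departure(rows).most_common()


print(rank_departures(SAMPLE_ROWS))  # [('London (LHR)', 2626)]
```

Run on the full data set, the same call would return all ten airports ordered by their monthly totals.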

2. On which days in March do most of the flights come to Dubai?


This graph shows that the highest total numbers of flights coming to Dubai in March are on March 16 and March 30.
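
Finding the busiest days amounts to adding up the flights from every departure airport for each day and picking the largest totals. The following is a minimal sketch of that idea with hypothetical function names. It is fed only the five sample rows from the first data set, so its output covers March 1 to 5 and does not reproduce the result in the graph.

```python
from collections import Counter

# The five sample rows of the first data set: (departure, arrival, flights, day).
SAMPLE_ROWS = [
    ("London (LHR)", "Dubai (DXB)", 453, 1),
    ("London (LHR)", "Dubai (DXB)", 522, 2),
    ("London (LHR)", "Dubai (DXB)", 560, 3),
    ("London (LHR)", "Dubai (DXB)", 525, 4),
    ("London (LHR)", "Dubai (DXB)", 566, 5),
]


def flights_per_day(rows):
    """Add up the flights from all departure airports for each day."""
    per_day = Counter()
    for _departure, _arrival, flights, day in rows:
        per_day[day] += flights
    return per_day


def top_days(rows, n=2):
    """Return the n busiest (day, total_flights) pairs, busiest first."""
    return flights_per_day(rows).most_common(n)


print(top_days(SAMPLE_ROWS))  # [(5, 566), (3, 560)]
```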

3. How does the number of flights coming from London to Dubai vary during the month? Does London always have the highest number of flights going to Dubai during the month?

These two visuals and the previous one show that most of the flights arrive on the following days: 8, 15, 16, 27, 30.

The data from these visualizations is filtered: from here on, the analysis is applied to flights going to Dubai from London, mainly on the days mentioned above.

4. At what time of day is the airport busiest in terms of number of flights?

This graph shows that flights coming from London on the busy days mainly arrive at night (around 4 am and 10 pm).
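
A time-of-day profile like this can be built by converting each arrival time to an hour on the 24-hour clock and counting flights per hour. The sketch below assumes arrival times formatted like "3:05 AM", with any day-offset marker already removed. The function names are hypothetical, and the input is just the five arrival times from the sample rows, so it does not show the 4 am and 10 pm peaks.

```python
from collections import Counter
from datetime import datetime


def arrival_hour(text):
    """Return the 24-hour clock hour for a time such as '3:05 AM'."""
    return datetime.strptime(text.strip(), "%I:%M %p").hour


def arrivals_by_hour(times):
    """Count how many flights arrive in each hour of the day."""
    return Counter(arrival_hour(t) for t in times)


# Arrival times from the five sample rows of the second data set.
sample_times = ["3:05 AM", "3:05 AM", "12:50 AM", "3:00 AM", "3:05 AM"]
counts = arrivals_by_hour(sample_times)
print(sorted(counts.items()))  # [(0, 1), (3, 4)]
```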

5. Does the airline affect that?

This visualization shows that the airlines with the highest number of trips to Dubai are Alitalia, Etihad Airways, and Multiple Airlines.

This graph shows how the total number of flight trips on Alitalia, Etihad Airways, and Multiple Airlines varies throughout the day.


One clear finding is that the airlines with a high number of trips to Dubai (Alitalia, Etihad Airways, and Multiple Airlines) arrive at DXB at the same times (around 4 am and 10 pm). A possible solution is for these airlines to use different arrival times at DXB during the day. This could be worked out from the airlines' schedules to find alternative timings that suit all airlines. At a later stage, more analysis can be done on different airlines, airports, and timings for further results.
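
One way to look for the overlap described here is to group arrivals by hour and airline and flag the hours in which more than one airline lands. The sketch below illustrates the idea only. The function names are hypothetical, and the records are made-up placeholders, not values from the scraped data.

```python
from collections import defaultdict


def arrivals_by_hour_and_airline(records):
    """Build {hour: {airline: number_of_arrivals}} from (airline, hour) pairs."""
    table = defaultdict(lambda: defaultdict(int))
    for airline, hour in records:
        table[hour][airline] += 1
    return table


def shared_hours(records, min_airlines=2):
    """Return the hours in which at least min_airlines different airlines arrive."""
    table = arrivals_by_hour_and_airline(records)
    return sorted(
        hour for hour, airlines in table.items() if len(airlines) >= min_airlines
    )


# Made-up (airline, arrival hour) records for illustration only.
toy_records = [
    ("Alitalia", 4),
    ("Etihad Airways", 4),
    ("Multiple Airlines", 22),
    ("Alitalia", 22),
    ("Etihad Airways", 13),
]
print(shared_hours(toy_records))  # [4, 22]
```

Hours that come back from such a check are the candidates for rescheduling; hours with a single airline or none are where arrivals could be moved.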


About Author

Fatima Hamdan

Fatima got her bachelor's degree in Computer Engineering from Lebanese American University. She was chosen as one of the 24 women in engineering change makers from all over the world to attend the Women in Engineering conference in...