2016
DOI: 10.1145/2888402

Using Scalable Data Mining for Predicting Flight Delays

Abstract: Flight delays are frequent all over the world (about 20% of airline flights arrive more than 15 min late) and they are estimated to have an annual cost of billions of dollars. This scenario makes the prediction of flight delays a primary issue for airlines and travelers. The main goal of this work is to implement a predictor of the arrival delay of a scheduled flight due to weather conditions. The predicted arrival delay takes into consideration both flight information (origin airport, destination airport, sche…

Cited by 74 publications (46 citation statements)
References 10 publications
“…As a result, the proposed Random Forest, which is an ensemble learning method, predicted 80.36% of arrival delay. In [8], Belcastro et al predicted the arrival flight delay due to bad weather by Random Forest in MapReduce. This study, which used weather data of NOAA, showed the result of prediction as follows: (1) with a delay threshold of 15 min, an accuracy of 74.2% and 71.8% recall on delayed flights are achieved, and (2) with a threshold of 60 min, the accuracy is 85.8% and the delay recall is 86.9%.…”
Section: Literature Review (mentioning)
confidence: 99%
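
The two metrics quoted in this statement (accuracy, and recall on delayed flights at a given delay threshold) can be illustrated with a minimal sketch. It is not the cited authors' code: the function names and the sample delays are hypothetical, and it assumes a flight counts as delayed when its arrival delay is strictly greater than the threshold.

```python
from typing import Sequence


def label_delayed(delays_min: Sequence[float], threshold_min: float) -> list[bool]:
    """Mark a flight as delayed when its arrival delay exceeds the threshold.

    Assumption: "delayed" means strictly more than threshold_min minutes late.
    """
    return [d > threshold_min for d in delays_min]


def accuracy_and_delay_recall(
    actual: Sequence[bool], predicted: Sequence[bool]
) -> tuple[float, float]:
    """Return (accuracy, recall on the delayed class) for boolean labels."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one flight is required")

    # Accuracy: share of flights whose predicted label matches the actual one.
    correct = sum(a == p for a, p in zip(actual, predicted))
    accuracy = correct / len(actual)

    # Delay recall: share of actually delayed flights predicted as delayed.
    actual_delayed = sum(actual)
    true_positives = sum(a and p for a, p in zip(actual, predicted))
    recall = true_positives / actual_delayed if actual_delayed else float("nan")
    return accuracy, recall


# Hypothetical arrival delays in minutes and hypothetical model predictions.
observed_delays = [5, 20, 70, 10, 30]
actual_15 = label_delayed(observed_delays, threshold_min=15)
predicted_15 = [False, True, False, True, True]
acc_15, rec_15 = accuracy_and_delay_recall(actual_15, predicted_15)
```

Raising the threshold from 15 to 60 minutes changes which flights count as delayed, so both metrics have to be recomputed against the relabelled data rather than compared directly across thresholds.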
“…The hybrid approach was compared by the most widely used models LR [9], [13] and DT [14], [30], [31], [11], [28]. In addition, it was compared with a benchmark from Kaggle.com to prove its authenticity.…”
Section: Methods (mentioning)
confidence: 99%
“…The advanced machine learning techniques and associated data mining tools can help to understand and predict several complex phenomena, the approach is used in enabling businesses and research collaborations alike to make informed decisions flight delay prediction [9]. In addition, every year approximately 20% of airline flights are delayed or canceled mainly due to bad weather, carrier equipment, security or technical airport problems [9], [2]. These delays result in significant cost to both airlines and passengers [9], [2], [1].…”
Section: Introduction (mentioning)
confidence: 99%
“…To get accurate prediction models and correctly evaluate them, we used balanced training sets and test sets in which half of the log events end with the purchase of a ticket and half with the abandonment of the platform. For this reason, we used the random under-sampling algorithm [25], which balances class distribution through random discarding of major class tuples as described in [26].…”
Section: Step 4: Prediction Model (mentioning)
confidence: 99%
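
The random under-sampling step described in this statement, which balances the class distribution by randomly discarding majority-class tuples, can be sketched as follows. This is an illustrative version only, not the implementation used in either paper: the function name `random_undersample`, the `seed` parameter, and the toy data are assumptions.

```python
import random
from collections import defaultdict
from typing import Hashable, Sequence, TypeVar

T = TypeVar("T")


def random_undersample(
    rows: Sequence[T], labels: Sequence[Hashable], seed: int = 0
) -> tuple[list[T], list[Hashable]]:
    """Randomly discard rows until every class is as small as the rarest one."""
    if len(rows) != len(labels):
        raise ValueError("rows and labels must have the same length")

    # Group row indices by class label.
    indices_by_class: dict[Hashable, list[int]] = defaultdict(list)
    for index, label in enumerate(labels):
        indices_by_class[label].append(index)
    if not indices_by_class:
        return [], []

    # Every class is cut down to the size of the smallest class.
    target_size = min(len(ix) for ix in indices_by_class.values())

    rng = random.Random(seed)  # seeded so the sketch is reproducible
    kept: list[int] = []
    for class_indices in indices_by_class.values():
        kept.extend(rng.sample(class_indices, target_size))

    kept.sort()  # preserve the original row order
    return [rows[i] for i in kept], [labels[i] for i in kept]


# Hypothetical imbalanced data: 2 positive rows and 8 negative rows.
toy_rows = list(range(10))
toy_labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
balanced_rows, balanced_labels = random_undersample(toy_rows, toy_labels, seed=42)
```

Under-sampling throws information away, so it is usually applied when the majority class is large enough that the discarded tuples are unlikely to matter for the model.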