Flight Delay Prediction for Mitigation of Airport Commercial Revenue Losses Using Machine Learning on Imbalanced Dataset

Rae Arun Sugara, D. Purwitasari
2022 International Conference on Computer Engineering, Network, and Intelligent Multimedia (CENIM), published 2022-11-22
DOI: 10.1109/CENIM56801.2022.10037369 (https://doi.org/10.1109/CENIM56801.2022.10037369)

Abstract

Flight delays reduce both customer satisfaction and airport revenue: beyond shaping passengers' perception of airport services, they disrupt airport operations and cut income. This study builds a flight delay prediction model using the Decision Tree, Random Forest, Gradient Boosted Tree, and XGBoost Tree algorithms. Weather data is merged, as secondary data, with the airport's operational flight data. To address class imbalance, several sampling techniques are applied: Synthetic Minority Over-sampling Technique (SMOTE), Random Over-Sampling (ROS), Random Under-Sampling (RUS), and a combination of ROS and RUS. The outcome is a model that predicts the flight delay category, evaluated with the confusion matrix and the Area Under the ROC Curve (AUC). The Random Forest classifier, combined with ROS + RUS and a data split ratio of 90:10, performed best on the test data, with an accuracy of 82.58%, an error rate of 17.42%, and an AUC of 81.1%. The prediction model is expected to inform strategic recommendations for future airport policy. With the best operational strategy in place, the airport can carry out commercial planning to optimize its commercial revenue.
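The combined ROS + RUS step and the accuracy/error-rate relationship can be illustrated with a minimal, standard-library Python sketch. This is not the authors' implementation: the function names, the `target_per_class` parameter, and the toy labels (0 = on time, 1 = delayed) are assumptions made for illustration, and the abstract does not state which resampling ratios or tooling were used. The sketch brings every class to the same size by under-sampling larger classes and over-sampling smaller ones, then derives accuracy and error rate from confusion-matrix counts. The error rate is the complement of accuracy, which matches the reported figures (82.58% + 17.42% = 100%).

```python
import random
from collections import Counter


def combine_ros_rus(rows, labels, target_per_class, seed=0):
    """Resample every class to exactly `target_per_class` rows.

    Classes at or above the target are randomly under-sampled (RUS,
    drawn without replacement); classes below the target are randomly
    over-sampled (ROS, extra rows drawn with replacement).
    `target_per_class` is a hypothetical parameter for this sketch.
    """
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)

    out_rows, out_labels = [], []
    for label, members in by_class.items():
        if len(members) >= target_per_class:
            chosen = rng.sample(members, target_per_class)  # RUS
        else:
            extra = rng.choices(members, k=target_per_class - len(members))
            chosen = members + extra  # ROS: originals plus duplicates
        out_rows.extend(chosen)
        out_labels.extend([label] * target_per_class)
    return out_rows, out_labels


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn


def accuracy_and_error(tp, fp, fn, tn):
    """Accuracy is the share of correct predictions; error rate is 1 - accuracy."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return accuracy, 1.0 - accuracy


if __name__ == "__main__":
    # Toy imbalanced training set: 90 on-time flights, 10 delayed flights.
    rows = list(range(100))
    labels = [0] * 90 + [1] * 10
    _, balanced_labels = combine_ros_rus(rows, labels, target_per_class=50)
    print(Counter(balanced_labels))
```

In general, resampling of this kind is applied to the training portion only, so that duplicated minority rows do not leak into the held-out test data; whether the study followed this order is not stated in the abstract.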