Thunderstorm Prediction Model Using SMOTE Sampling and Machine Learning Approach

Shirley Anak Rufus, N. A. Ahmad, Z. Abdul-Malek, N. Abdullah
Published in: 2023 12th Asia-Pacific International Conference on Lightning (APL), 2023-06-12
DOI: 10.1109/APL57308.2023.10182046 (https://doi.org/10.1109/APL57308.2023.10182046)
Citations: 0

Abstract

Thunderstorms are among the most destructive weather phenomena worldwide; the lightning and heavy rain they bring cause human fatalities, urban floods, and crop damage. Predicting thunderstorms with reasonable accuracy is therefore crucial for planning and management in many applications, including agriculture, flood control, and air traffic control. This study uses historical lightning and meteorological data from 2011 to 2018 for the southern region of Peninsular Malaysia to predict thunderstorm occurrence. Because positive cloud-to-ground (CG) lightning occurs far less often than negative CG lightning, and because thunderstorms and lightning are themselves non-linear and complex, the dataset is imbalanced. The Synthetic Minority Over-sampling Technique (SMOTE), a resampling method, is applied to correct the imbalance in the training dataset. Five Machine Learning (ML) algorithms are then trained and tested on the data: Decision Trees (DT), Adaptive Boosting (AdaBoost), Random Forest (RF), Extra Trees (ET), and Gradient Boosting (GB). With SMOTE, the models achieve accuracy of 74% to 95%, recall of 72% to 93%, precision of 76% to 97%, and F1-score of 74% to 95%. On these performance metrics, the combination of SMOTE and GB is the best-performing model for thunderstorm prediction in this region. In the future, predictions based on lightning patterns and weather data could help the relevant authorities plan early responses to thunderstorms.
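The resampling and scoring steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it is a simplified, standard-library-only version of the SMOTE idea (interpolating between a minority-class sample and one of its nearest minority-class neighbours), followed by the four metrics the abstract reports. The function names `smote` and `binary_metrics`, the two-feature toy samples, and the choice of `k` are all assumptions made for illustration; the paper's actual features and settings are not given here.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Simplified SMOTE: build n_synthetic points by interpolating between
    a minority-class sample and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position on the segment from base to neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": (tp + tn) / len(pairs), "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical toy data: three minority samples with two made-up features,
# to be balanced against an assumed majority class of eight samples.
minority = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.4]]
majority_count = 8
synthetic = smote(minority, n_synthetic=majority_count - len(minority), k=2)
print(len(minority) + len(synthetic), "minority samples after oversampling")

# Hypothetical labels and predictions, only to show the metric arithmetic.
print(binary_metrics([1, 1, 0, 0], [1, 0, 0, 0]))
```

As the abstract indicates, oversampling is applied to the training data only; the test set should keep its original class balance so that the reported metrics reflect real conditions. In practice, a maintained implementation of SMOTE and of the five classifiers would normally be used instead of hand-written code like this.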