Title: Prediction of lightning events over Bangladesh: A machine learning perspective
Authors: Kumarjit Saha, Deepak S. Bisht, R. Ashrit
Journal: Journal of Atmospheric and Solar-Terrestrial Physics, volume 268, Article 106448
Publication date: 2025-01-24
Publication type: Journal Article
DOI: 10.1016/j.jastp.2025.106448
URL: https://www.sciencedirect.com/science/article/pii/S136468262500032X
Citations: 0
Abstract
This study develops an XGBoost classifier to predict lightning activity over Bangladesh during the pre-monsoon season using ERA5 reanalysis atmospheric data. Traditional lightning forecasts have often relied on thermodynamic indices and numerical models, which are limited by sparse dependent variables and incomplete physical parameterizations. This study applies machine learning (ML) to address these limitations by identifying and ranking the atmospheric variables with the greatest influence on lightning occurrence. The model incorporates key predictors, including Mean Convective Precipitation Rate (MCPR), Total Column Cloud Ice Water (TCCIW), Total Totals Index (TTI), Convective Available Potential Energy (CAPE), High Cloud Cover (HCC), and Vertical Integral Convergence of Moisture Flux (VICMF), which play crucial roles in the charge-separation processes necessary for storm electrification. Among them, MCPR and TCCIW were identified as the most impactful: high values of these variables correlate with increased lightning due to enhanced convective and charge-separation processes in clouds. Additional features, including CAPE, TTI, and temporal cycles, further support predictions by reflecting atmospheric instability. At a threshold of 0.6, the model achieved a Probability of Detection of 86.01%, a False Alarm Ratio of 57.82%, an Equitable Threat Score of 23.84%, an F1 Score of 57%, an AUC of 76%, and an accuracy of 71.08%, surpassing models with broader variable inclusion. This feature-targeted approach, guided by SHAP analysis, highlights the importance of select variables for lightning forecasting accuracy. ML approaches have shown promise for handling large, complex datasets in atmospheric science and outperforming traditional methods.
This study is the first to apply an ML model specifically for pre-monsoon lightning prediction in Bangladesh, providing a targeted, data-driven approach to improve forecasting accuracy for this region. Future efforts will explore hybrid models integrating machine learning with dynamic methods to improve real-time predictive power for lightning in Bangladesh.
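The verification scores quoted in the abstract are standard categorical forecast measures. As a minimal sketch (not the authors' code), the Python below shows how such scores are commonly computed from a 2x2 contingency table after a threshold is applied to predicted probabilities. The toy probabilities and observations, the names `Contingency`, `contingency_table`, `scores`, and `f1_from_pod_far`, and the use of `>=` for the threshold comparison are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Contingency:
    hits: int               # lightning forecast and observed
    misses: int             # lightning observed but not forecast
    false_alarms: int       # lightning forecast but not observed
    correct_negatives: int  # no lightning forecast, none observed


def contingency_table(probs, observed, threshold=0.6):
    """Turn probabilities into yes/no forecasts and count the four outcomes."""
    hits = misses = false_alarms = correct_negatives = 0
    for p, obs in zip(probs, observed):
        forecast = p >= threshold  # assumption: ">=" rather than ">"
        if forecast and obs:
            hits += 1
        elif forecast:
            false_alarms += 1
        elif obs:
            misses += 1
        else:
            correct_negatives += 1
    return Contingency(hits, misses, false_alarms, correct_negatives)


def f1_from_pod_far(pod, far):
    """F1 for the event class, using recall = POD and precision = 1 - FAR."""
    precision = 1.0 - far
    return 2.0 * precision * pod / (precision + pod)


def scores(t):
    """Categorical scores; assumes no denominator below is zero."""
    n = t.hits + t.misses + t.false_alarms + t.correct_negatives
    pod = t.hits / (t.hits + t.misses)
    far = t.false_alarms / (t.hits + t.false_alarms)
    # Hits expected by chance, given the forecast and observed event counts.
    hits_random = (t.hits + t.misses) * (t.hits + t.false_alarms) / n
    ets = (t.hits - hits_random) / (
        t.hits + t.misses + t.false_alarms - hits_random
    )
    return {
        "pod": pod,
        "far": far,
        "ets": ets,
        "f1": f1_from_pod_far(pod, far),
        "accuracy": (t.hits + t.correct_negatives) / n,
    }


# Toy data, invented purely for illustration (not from the study).
toy_probs = [0.9, 0.7, 0.65, 0.3, 0.8, 0.2, 0.1, 0.55]
toy_observed = [1, 1, 0, 1, 0, 0, 0, 0]

table = contingency_table(toy_probs, toy_observed, threshold=0.6)
result = scores(table)
print(table)
print(result)
```

Under these definitions, and assuming the F1 Score is computed for the lightning class, `f1_from_pod_far(0.8601, 0.5782)` gives about 0.566, which is consistent with the 57% reported in the abstract.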
About the journal:
The Journal of Atmospheric and Solar-Terrestrial Physics (JASTP) is an international journal concerned with the inter-disciplinary science of the Earth's atmospheric and space environment, especially the highly varied and highly variable physical phenomena that occur in this natural laboratory and the processes that couple them.
The journal covers the physical processes operating in the troposphere, stratosphere, mesosphere, thermosphere, ionosphere, magnetosphere, the Sun, interplanetary medium, and heliosphere. Phenomena occurring in other "spheres", solar influences on climate, and supporting laboratory measurements are also considered. The journal deals especially with the coupling between the different regions.
Solar flares, coronal mass ejections, and other energetic events on the Sun create interesting and important perturbations in the near-Earth space environment. The physics of such "space weather" is central to the Journal of Atmospheric and Solar-Terrestrial Physics, and the journal welcomes papers that lead in the direction of a predictive understanding of the coupled system. Regarding the upper atmosphere, aeronomy, geomagnetism and geoelectricity, auroral phenomena, radio wave propagation, and plasma instabilities are examples of subjects within the broad field of solar-terrestrial physics that emphasise the energy exchange between the solar wind, the magnetospheric and ionospheric plasmas, and the neutral gas. In the lower atmosphere, topics range from mesoscale to global-scale dynamics, atmospheric electricity, lightning and its effects, and anthropogenic changes.