NWP-based lightning prediction using flexible count data regression
T. Simon, G. Mayr, Nikolaus Umlauf, A. Zeileis
Advances in Statistical Climatology, Meteorology and Oceanography, published 2019-02-04
DOI: 10.5194/ASCMO-5-1-2019 (https://doi.org/10.5194/ASCMO-5-1-2019)
Citation count: 14
Abstract
A method to predict lightning by postprocessing numerical weather prediction
(NWP) output is developed for the region of the European Eastern Alps.
Cloud-to-ground (CG) flashes – detected by the ground-based Austrian
Lightning Detection & Information System (ALDIS) network – are counted on
the 18 × 18 km² grid of the 51-member NWP ensemble of the European
Centre for Medium-Range Weather Forecasts (ECMWF). These counts serve as the
target quantity in count data regression models for the occurrence of
lightning events and flash counts of CG. The probability of lightning
occurrence is modelled by a Bernoulli distribution. The flash counts are
modelled with a hurdle approach where the Bernoulli distribution is combined
with a zero-truncated negative binomial. In the statistical models the
parameters of the distributions are described by additive predictors, which
are assembled using potentially nonlinear functions of NWP covariates.
Measures of location and spread of 100 direct and derived NWP covariates
provide a pool of candidates for the nonlinear terms. A combination of
stability selection and gradient boosting identifies the nine (three) most
influential terms for the parameters of the Bernoulli (zero-truncated
negative binomial) distribution, most of which turn out to be associated with
either convective available potential energy (CAPE) or convective
precipitation. Markov chain Monte Carlo (MCMC) sampling estimates the final
model to provide credible inference of effects, scores, and
predictions. The selection of terms and MCMC sampling are applied to data from
the year 2016, and out-of-sample performance is evaluated for 2017. The
occurrence model outperforms a reference climatology – based on 7 years of
data – up to a forecast horizon of 5 days. The flash count model is
calibrated and also outperforms climatology for exceedance probabilities,
quantiles, and full predictive distributions.
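
The hurdle construction described in the abstract can be illustrated with a minimal sketch. It assumes fixed scalar parameters, whereas in the paper each parameter is driven by an additive predictor of NWP covariates. The function names and the mean/dispersion parameterization of the negative binomial (`mu`, `theta`) are assumptions made for illustration and are not taken from the paper. With occurrence probability `pi`, the hurdle probability mass is `1 - pi` at zero and `pi * f(k) / (1 - f(0))` for counts `k >= 1`, where `f` is the untruncated negative binomial mass function.

```python
import math


def nb_pmf(k, mu, theta):
    """Negative binomial pmf with mean mu and dispersion theta.

    Assumed parameterization: variance = mu + mu**2 / theta.
    """
    log_p = (
        math.lgamma(k + theta) - math.lgamma(theta) - math.lgamma(k + 1)
        + theta * math.log(theta / (theta + mu))
        + k * math.log(mu / (theta + mu))
    )
    return math.exp(log_p)


def hurdle_pmf(k, pi, mu, theta):
    """Hurdle pmf: Bernoulli occurrence combined with a zero-truncated NB."""
    if k == 0:
        # No lightning in the grid cell: the hurdle is not crossed.
        return 1.0 - pi
    # Renormalize the NB mass over k >= 1 (zero truncation).
    return pi * nb_pmf(k, mu, theta) / (1.0 - nb_pmf(0, mu, theta))


def exceedance_prob(threshold, pi, mu, theta):
    """P(Y >= threshold) under the hurdle model."""
    if threshold <= 0:
        return 1.0
    below = sum(hurdle_pmf(k, pi, mu, theta) for k in range(threshold))
    return 1.0 - below


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only to exercise the functions.
    pi, mu, theta = 0.3, 2.0, 1.5
    print("P(Y = 0)  =", hurdle_pmf(0, pi, mu, theta))
    print("P(Y >= 5) =", exceedance_prob(5, pi, mu, theta))
```

Because the zero class is handled entirely by the Bernoulli part, the probability of at least one flash equals `pi` regardless of the count parameters. This is why the occurrence model and the flash count model can share the same Bernoulli component.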