Alexandros Angelakis, Bryan O Nyawanda, Penelope Vounatsou
Modeling sparse Rift Valley fever incidence data: a Bayesian perspective on zero-inflated self-exciting and autoregressive models
BMC Infectious Diseases 25(1), 1221. Published 2025-09-29. DOI: 10.1186/s12879-025-11506-0 (https://doi.org/10.1186/s12879-025-11506-0)
Citations: 0
Abstract
Background: Rift Valley fever (RVF) is a mosquito-borne zoonotic disease for which predictive modeling is often hindered by sparse data, particularly the high frequency of zero counts in both human and livestock surveillance systems. While zero-inflated models are commonly used for sparse data, several temporal count modeling frameworks exist, including less common self-exciting models, which assume that an initial case increases the likelihood of subsequent cases.
Methods: This study compares three zero-inflated Bayesian models: the negative binomial (ZINB) with autoregressive temporal random effects, the self-exciting negative binomial (SE-NB) and the generalized autoregressive moving average negative binomial (GARMA-NB). The models were evaluated across simulated datasets with varying levels of sparsity.
Results: We found that zero-inflation substantially improves predictive performance within specific sparsity thresholds: 29-94.5% (ZINB), 25-93% (SE-NB), and 30-95% (GARMA-NB). Applied to monthly RVF incidence data from northern Kenya (2018-2024), the ZINB model with a three-month rainfall lag provided the most accurate forecasts.
Conclusion: These findings underscore the importance of zero-inflated negative binomial models and climate-based covariates in enhancing early warning systems for RVF-endemic regions.
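For readers unfamiliar with zero inflation, the following is a minimal sketch of a zero-inflated negative binomial probability mass function, not the authors' implementation. It omits the temporal structure (autoregressive random effects, self-excitation, GARMA terms) and the Bayesian fitting described in the Methods. The parameter names `mu` (mean), `size` (dispersion) and `pi` (probability of a structural zero) are illustrative assumptions, as is the mean-dispersion parameterization.

```python
import math


def nb_pmf(y, mu, size):
    """Negative binomial pmf with mean mu > 0 and dispersion size > 0.

    Assumed parameterization: variance = mu + mu**2 / size.
    """
    log_p = (
        math.lgamma(y + size) - math.lgamma(size) - math.lgamma(y + 1)
        + size * math.log(size / (size + mu))
        + y * math.log(mu / (size + mu))
    )
    return math.exp(log_p)


def zinb_pmf(y, mu, size, pi):
    """Zero-inflated NB pmf.

    With probability pi the count is a structural zero; otherwise it is
    drawn from the negative binomial, which can also produce zeros.
    """
    base = (1.0 - pi) * nb_pmf(y, mu, size)
    return pi + base if y == 0 else base


def zero_fraction(mu, size, pi):
    """Expected share of zero counts (the 'sparsity') under the ZINB."""
    return zinb_pmf(0, mu, size, pi)
```

Under these assumptions, `zero_fraction(mu, size, pi)` shows how the structural-zero probability raises the share of zeros beyond what the negative binomial alone would produce, which is the kind of sparsity the simulated datasets vary.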
About the journal:
BMC Infectious Diseases is an open access, peer-reviewed journal that considers articles on all aspects of the prevention, diagnosis and management of infectious and sexually transmitted diseases in humans, as well as related molecular genetics, pathophysiology, and epidemiology.