Dengue Early Warning System and Outbreak Prediction Tool in Bangladesh Using Interpretable Tree-Based Machine Learning Model

Impact factor: 2.1 · JCR Q2 · Medicine, General & Internal
Md. Siddikur Rahman, Miftahuzzannat Amrin, Md. Abu Bokkor Shiddik
Journal: Health Science Reports, volume 8, issue 5
Published: 2025-05-09
DOI: 10.1002/hsr2.70726
URL: https://onlinelibrary.wiley.com/doi/10.1002/hsr2.70726
Citations: 0

Abstract

Background and Aims

Dengue fever (DF), a life-threatening vector-borne disease, poses significant public health and economic threats worldwide, including in Bangladesh. Identifying dengue risk factors is crucial for early warning systems that forecast epidemics and support efficient control strategies. To address this, we propose an interpretable tree-based machine learning (ML) model for dengue early warning and outbreak prediction in Bangladesh based on climatic, sociodemographic, and landscape factors.

Methods

A framework for forecasting DF risk was developed using high-performance ML algorithms, namely Random Forest, eXtreme Gradient Boosting (XGBoost), and Light Gradient Boosting Machine (LightGBM), based on sociodemographic, climate, landscape, and dengue surveillance data (January 2000 to December 2021). The optimal tree-based ML model with strong interpretability was selected by comparing the candidate models after hyperparameter optimization. SHapley Additive exPlanations (SHAP) values were used to rank feature importance and to identify the most significant dengue driver.
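To make the SHAP step in the Methods more concrete, the following is a minimal sketch of the exact Shapley attribution that SHAP values are built on. It is not the authors' pipeline: the `toy_risk` function, its coefficients, the feature names, and the `instance` and `baseline` values are all hypothetical, and the sketch enumerates feature subsets directly in plain Python rather than calling the SHAP library on a fitted LightGBM model.

```python
from itertools import combinations
from math import factorial

# Hypothetical feature names, loosely modelled on the drivers named in the abstract.
FEATURES = ["min_temperature_c", "population_density", "agricultural_land_pct"]


def toy_risk(x):
    """Hypothetical nonlinear risk score; NOT the paper's fitted model."""
    warm = 1.0 if x["min_temperature_c"] >= 22.0 else 0.0  # illustrative threshold
    return (
        0.5 * warm
        + 0.0002 * x["population_density"]
        + 0.004 * x["agricultural_land_pct"]
        + 0.3 * warm * (x["agricultural_land_pct"] / 100.0)  # interaction term
    )


def coalition_value(subset, instance, baseline, model):
    """Evaluate the model with features in `subset` taken from the instance
    and all remaining features taken from the baseline."""
    mixed = {
        name: (instance[name] if name in subset else baseline[name])
        for name in FEATURES
    }
    return model(mixed)


def shapley_values(instance, baseline, model):
    """Exact Shapley values by enumerating every subset of the other features."""
    n = len(FEATURES)
    phi = {}
    for feature in FEATURES:
        others = [name for name in FEATURES if name != feature]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_feature = coalition_value(
                    set(subset) | {feature}, instance, baseline, model
                )
                without_feature = coalition_value(
                    set(subset), instance, baseline, model
                )
                total += weight * (with_feature - without_feature)
        phi[feature] = total
    return phi


# Hypothetical observation and reference point.
instance = {"min_temperature_c": 26.0, "population_density": 1200.0,
            "agricultural_land_pct": 60.0}
baseline = {"min_temperature_c": 18.0, "population_density": 800.0,
            "agricultural_land_pct": 40.0}

phi = shapley_values(instance, baseline, toy_risk)
for name, contribution in sorted(phi.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name}: {contribution:+.4f}")
```

The attributions sum to the difference between the model output for the instance and for the baseline, which is the property that makes a ranking of features by absolute contribution interpretable. Exact enumeration grows exponentially with the number of features, so it is only practical for toy cases like this one.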

Results

We detected nonlinear effects of climatic parameters on dengue, with thresholds at a mean temperature of 27°C, a minimum temperature of 22°C, a maximum temperature of 32°C, and a relative humidity of 82%. The optimal minimum and maximum temperatures, humidity, rainfall, and wind speed for dengue risk are 25–28°C, 32–34°C, 75%–85%, 10 mm, and 12 m/s, respectively. The LightGBM model accurately forecasts DF, and agricultural land, population density, and minimum temperature significantly affect dengue outbreaks in Bangladesh.
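As a rough illustration of how the reported ranges could be read against a set of weather observations, the sketch below flags which climatic variables fall inside the intervals quoted in the Results. This is an assumption-laden reading aid, not the paper's warning rule, which is the LightGBM model. The dictionary keys, the `favourable_conditions` helper, and the sample observation are hypothetical; rainfall and wind speed are left out because the abstract gives single values for them rather than ranges.

```python
# Ranges quoted in the Results paragraph; the key names are hypothetical.
OPTIMAL_RANGES = {
    "min_temperature_c": (25.0, 28.0),
    "max_temperature_c": (32.0, 34.0),
    "relative_humidity_pct": (75.0, 85.0),
}


def favourable_conditions(observation):
    """Return the variables whose observed value lies inside its reported range."""
    return [
        name
        for name, (low, high) in OPTIMAL_RANGES.items()
        if low <= observation[name] <= high
    ]


# Hypothetical observation for illustration only.
observation = {
    "min_temperature_c": 26.0,
    "max_temperature_c": 33.0,
    "relative_humidity_pct": 90.0,
}

flagged = favourable_conditions(observation)
print(f"{len(flagged)} of {len(OPTIMAL_RANGES)} variables in range: {flagged}")
```

A simple count like this ignores the nonlinear and interacting effects that the tree-based model captures, so it is best treated as a quick screening aid alongside the model output rather than a substitute for it.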

Conclusion

Our proposed ML model functions as an early warning system, improving comprehension of the factors that precipitate dengue outbreaks and providing a framework for sophisticated analytical techniques in public health.

Source journal: Health Science Reports (Medicine: Medicine (all))
CiteScore: 1.80
Self-citation rate: 0.00%
Articles published: 458
Review time: 20 weeks