Integrating machine learning and spatial clustering for malaria case prediction in Brazil's Legal Amazon.

IF 3.4 · Zone 3, Medicine · Q2, INFECTIOUS DISEASES
Kayo Henrique de Carvalho Monteiro, Élisson da Silva Rocha, Luis Augusto Morais, Elton Gino Santos, Sebastião Rogerio da S Neto, Vanderson Sampaio, Patricia Takako Endo
BMC Infectious Diseases 25(1):802. Published 2025-06-08. DOI: https://doi.org/10.1186/s12879-025-11193-x
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12147289/pdf/

Abstract

Malaria remains a major global health challenge, particularly in Brazil's Legal Amazon region, where environmental and socioeconomic conditions favor disease transmission. Traditional control measures have shown limited effectiveness, underscoring the need for better predictive approaches to support timely and targeted public health interventions. This study evaluates six computational models for forecasting weekly malaria cases across multiple states in the Legal Amazon: Long Short-Term Memory (LSTM), Gated Recurrent Units (GRU), Support Vector Regression (SVR), Random Forest (RF), eXtreme Gradient Boosting (XGBoost), and Autoregressive Integrated Moving Average (ARIMA). The RF model outperformed the other models in most cases, achieving the lowest Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE); in cluster 02 of the state of Acre, for example, it reached an RMSE of 0.00203 and an MAE of 0.00133. Integrating K-means clustering further improved the models' predictive accuracy by accounting for spatial heterogeneity and capturing localized transmission dynamics. This hybrid approach, which combines machine learning models with spatial clustering, offers a promising tool for enhancing malaria surveillance and guiding more effective public health strategies, especially for malaria control in high-risk regions.
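
To make the two error metrics and the clustering step concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not say which features were clustered, how many clusters were used, or how the forecasting models were configured. The municipality coordinates, case counts, forecasts, and the function names `rmse`, `mae`, and `kmeans` are all hypothetical, and no forecasting model is fitted here; the sketch only shows how locations might be grouped and how a forecast for one group could be scored.

```python
import math
import random

def rmse(actual, predicted):
    """Root Mean Squared Error: squares each miss, so large errors weigh more."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    """Mean Absolute Error: average size of the miss, in the data's own units."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per input point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct input points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels

# Hypothetical (longitude, latitude) pairs for six municipalities; not real data.
municipalities = [(-70.0, -9.0), (-70.2, -9.1), (-69.9, -8.8),
                  (-60.0, -3.0), (-60.1, -3.2), (-59.8, -2.9)]
labels = kmeans(municipalities, k=2)

# Hypothetical weekly case counts and forecasts for a single cluster.
actual = [10, 12, 9, 15]
predicted = [11, 12, 7, 14]
print(labels, rmse(actual, predicted), mae(actual, predicted))
```

In a pipeline of the kind the abstract describes, each cluster's weekly series would then be passed to a separate forecasting model (RF, for instance), and RMSE and MAE would be reported per cluster.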

Source journal
BMC Infectious Diseases (Medicine: Infectious Diseases)
CiteScore: 6.50
Self-citation rate: 0.00%
Articles published: 860
Review time: 3.3 months
About the journal: BMC Infectious Diseases is an open access, peer-reviewed journal that considers articles on all aspects of the prevention, diagnosis and management of infectious and sexually transmitted diseases in humans, as well as related molecular genetics, pathophysiology, and epidemiology.