Prediction of tuberculosis clusters in the riverine municipalities of the Brazilian Amazon with machine learning.

Luis Silva, Luise Gomes da Motta, Lynn Eberly
Revista brasileira de epidemiologia = Brazilian journal of epidemiology, vol. 27, e240024. Journal Article. Published 2024-05-13. DOI: 10.1590/1980-549720240024 (https://doi.org/10.1590/1980-549720240024). Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11093519/pdf/

Abstract

Prediction of tuberculosis clusters in the riverine municipalities of the Brazilian Amazon with machine learning.

Objective: Tuberculosis (TB) is the second most deadly infectious disease globally, posing a significant burden in Brazil and its Amazonian region. This study focused on the "riverine municipalities" and hypothesized the presence of TB clusters in the area. We also aimed to train a machine learning model to differentiate municipalities classified as hot spots from non-hot spots, using disease surveillance variables as predictors.

Methods: Data on TB incidence from 2019 to 2022 in the riverine municipalities were collected from the Brazilian Health Ministry Informatics Department. Moran's I was used to assess global spatial autocorrelation, while the Getis-Ord Gi* method was employed to detect high- and low-incidence clusters. A Random Forest machine learning model was trained on surveillance variables related to TB cases to predict hot spots among non-hot spot municipalities.
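The two spatial statistics named in the Methods can be illustrated with a minimal pure-Python sketch. Everything specific here is an assumption made for illustration: the six incidence values are invented (they are not study data), the municipalities are treated as a simple west-to-east chain with binary adjacency weights, and the function names `morans_i` and `getis_ord_gi_star` are hypothetical. The abstract does not state which spatial weights the authors used.

```python
import math


def morans_i(values, weights):
    """Global Moran's I for observations x_i and spatial weights w_ij."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    # Cross-products of deviations, weighted by spatial proximity.
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / w_sum) * (num / den)


def getis_ord_gi_star(values, weights):
    """Getis-Ord Gi* z-scores, one per unit.

    The "star" variant counts each unit as its own neighbor (w_ii = 1).
    """
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean ** 2)
    z_scores = []
    for i in range(n):
        w = list(weights[i])
        w[i] = 1.0  # include the focal unit itself
        sw = sum(w)
        sw2 = sum(x * x for x in w)
        num = sum(wj * xj for wj, xj in zip(w, values)) - mean * sw
        den = s * math.sqrt((n * sw2 - sw ** 2) / (n - 1))
        z_scores.append(num / den)
    return z_scores


# Invented incidence rates for six units ordered west to east (NOT study data).
incidence = [90.0, 85.0, 80.0, 30.0, 25.0, 20.0]

# Assumed binary adjacency: each unit neighbors the next one along the chain.
n_units = len(incidence)
adjacency = [[1.0 if abs(i - j) == 1 else 0.0 for j in range(n_units)]
             for i in range(n_units)]

global_i = morans_i(incidence, adjacency)
gi_star = getis_ord_gi_star(incidence, adjacency)

print("Moran's I:", round(global_i, 3))
for idx, z in enumerate(gi_star):
    print("unit", idx, "Gi* z-score:", round(z, 3))
```

A positive Moran's I indicates that similar values sit next to each other. Large positive Gi* z-scores mark candidate hot spots and large negative ones mark cold spots; in practice a significance threshold, often with a multiple-testing correction, decides which units are flagged.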

Results: Our analysis revealed distinct geographical clusters with high and low TB incidence following a west-to-east distribution pattern. The Random Forest classification model used six surveillance variables to predict hot vs. non-hot spots. The model achieved an area under the receiver operating characteristic curve (AUC-ROC) of 0.81.
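For readers unfamiliar with the metric, AUC-ROC can be read as the probability that a randomly chosen hot spot receives a higher predicted score than a randomly chosen non-hot spot. The sketch below computes it that way by pairwise comparison. The labels and scores are invented for illustration and the function name `auc_roc` is hypothetical; this is not the authors' evaluation code.

```python
def auc_roc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Invented example: 1 = hot spot, 0 = non-hot spot (NOT study data).
true_labels = [1, 1, 1, 0, 0, 0, 0]
predicted_scores = [0.9, 0.7, 0.4, 0.8, 0.3, 0.2, 0.1]

example_auc = auc_roc(true_labels, predicted_scores)
print("AUC-ROC:", round(example_auc, 3))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, so the reported 0.81 means the model ranks a hot spot above a non-hot spot in roughly four out of five such pairs.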

Conclusion: Higher percentages of recurrent cases, deaths due to TB, antibiotic regimen changes, new cases, and cases with a smoking history were the best predictors of hot spot municipalities. This prediction method can be leveraged to identify the municipalities at the highest risk of being hot spots for the disease, giving policymakers an evidence-based tool to direct resource allocation for disease control in the riverine municipalities.
