Zachary J Madewell, Dania M Rodriguez, Maile B Thayer, Vanessa Rivera-Amill, Jomil Torres Aponte, Melissa Marzan-Rodriguez, Gabriela Paz-Bailey, Laura E Adams, Joshua M Wong
Tropical Medicine & International Health. Published 2025-09-15. DOI: 10.1111/tmi.70036 (https://doi.org/10.1111/tmi.70036)
Machine Learning for Improved Dengue Diagnosis in Puerto Rico.
Objectives: Diagnosing dengue accurately, especially in resource-limited settings, remains challenging due to overlapping symptoms with other febrile illnesses and limitations of current diagnostic methods. This study aimed to develop machine learning models that leverage readily available clinical data to improve diagnostic accuracy for dengue, potentially offering a more accessible and rapid diagnostic tool for healthcare providers.
Methods: We used data from the Sentinel Enhanced Dengue Surveillance System (SEDSS) in Puerto Rico (May 2012-June 2024). SEDSS primarily targets acute febrile illness but also includes cases with other symptoms during outbreaks (e.g., Zika and COVID-19). Seven machine learning models (logistic regression, random forest, support vector machine, artificial neural network, adaptive boosting, light gradient boosting machine [LightGBM] and extreme gradient boosting [XGBoost]) were evaluated across different feature sets, including demographic, clinical, laboratory and epidemiological variables. Model performance was assessed using the area under the receiver operating characteristic curve (AUC), where higher AUC values indicate better performance in distinguishing dengue cases from non-dengue cases.
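As an illustration of the AUC metric named in the Methods, the following is a minimal sketch in plain Python, not the authors' code. It uses the rank-based (Mann-Whitney) formulation, in which the AUC equals the probability that a randomly chosen dengue case receives a higher model score than a randomly chosen non-dengue case. The function name `roc_auc` and the toy labels and scores are hypothetical and are not drawn from the study data.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Rank-based AUC for binary labels (1 = dengue, 0 = non-dengue)."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    rank_sum_pos = 0.0  # sum of the (tie-averaged) ranks of positive cases
    n_pos = 0
    i = 0
    while i < n:
        # Find the run of tied scores pairs[i:j] and give each the average rank.
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        pos_in_group = sum(label for _, label in pairs[i:j])
        rank_sum_pos += avg_rank * pos_in_group
        n_pos += pos_in_group
        i = j
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative case")
    # Mann-Whitney U statistic for the positive class, scaled to [0, 1].
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)


# Hypothetical toy data: four patients, two of them laboratory-confirmed.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(toy_labels, toy_scores))
```

An AUC of 0.5 corresponds to scores that carry no information about dengue status, and 1.0 to perfect separation; the abstract reports these values as percentages.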
Results: Among 49,679 patients in SEDSS, 1,640 laboratory-confirmed dengue cases were identified. The XGBoost and LightGBM models achieved the highest diagnostic accuracy, with AUCs exceeding 90%, particularly with comprehensive feature sets. Incorporating predictors such as monthly dengue incidence, leukopenia, thrombocytopenia, rash, age and absence of nasal discharge significantly enhanced model sensitivity and specificity for diagnosing dengue. Adding more relevant clinical and epidemiological features consistently improved the models' ability to correctly identify dengue cases.
Conclusions: Machine learning models, especially XGBoost and LightGBM, show promise for improving diagnostic accuracy for dengue using widely accessible clinical data, even in resource-limited settings. Future research should focus on developing user-friendly tools, such as mobile apps, web-based platforms, or clinical decision systems integrated into electronic health records, to implement these models in clinical practice, and on exploring their application for predicting dengue.
About the journal:
Tropical Medicine & International Health is published on behalf of the London School of Hygiene and Tropical Medicine, Swiss Tropical and Public Health Institute, Foundation Tropical Medicine and International Health, Belgian Institute of Tropical Medicine and Bernhard-Nocht-Institute for Tropical Medicine. Tropical Medicine & International Health is the official journal of the Federation of European Societies for Tropical Medicine and International Health (FESTMIH).