Valley-Forecast: Forecasting Coccidioidomycosis incidence via enhanced LSTM models trained on comprehensive meteorological data

IF 4.0, Zone 2 (Medicine), JCR Q2 (Computer Science, Interdisciplinary Applications)
Leif Huender, Mary Everett, John Shovic
Journal of Biomedical Informatics, Volume 162, Article 104774
Published: 2025-02-01
DOI: 10.1016/j.jbi.2025.104774
Publisher page: https://www.sciencedirect.com/science/article/pii/S1532046425000036
Citation count: 0

Abstract

Coccidioidomycosis (cocci), more commonly known as Valley Fever, is a fungal infection caused by Coccidioides species that poses a significant public health challenge, particularly in the semi-arid regions of the Americas, with notable prevalence in California and Arizona. Previous epidemiological studies have established a correlation between cocci incidence and regional weather patterns, indicating that climatic factors influence the fungus’s life cycle and subsequent disease transmission. This study hypothesizes that Long Short-Term Memory (LSTM) and extended Long Short-Term Memory (xLSTM) models, known for their ability to capture long-term dependencies in time-series data, can outperform traditional statistical methods in predicting cocci outbreak cases. Our research analyzed daily meteorological features from 2001 to 2022 across 48 counties in California, covering diverse microclimates and levels of cocci incidence. The study evaluated 846 LSTM models and 176 xLSTM models with various fine-tuning metrics. To ensure the reliability of our results, these neural network architectures are compared against baseline regression and Multi-Layer Perceptron (MLP) models, providing a comprehensive comparative framework. We found that LSTM-type architectures outperform traditional methods, with xLSTM achieving the lowest test RMSE of 282.98 (95% CI: 259.2-306.8) compared to the baseline’s 468.51 (95% CI: 458.2-478.8), a reduction of 39.60% in prediction error. While both LSTM (283.50, 95% CI: 259.7-307.3) and MLP (293.14, 95% CI: 268.3-318.0) also showed substantial improvements over the baseline, the overlapping confidence intervals suggest similar predictive capabilities among the advanced models. This improvement in predictive capability suggests a strong correlation between temporal microclimatic variations and regional cocci incidence. The increased predictive power of these models has significant public health implications, potentially informing strategies for cocci outbreak prevention and control. Moreover, this study represents the first application of the novel xLSTM architecture in epidemiological research and pioneers the evaluation of modern machine learning methods’ accuracy in predicting cocci outbreaks. These findings contribute to ongoing efforts to address cocci, offering a new approach to understanding and potentially mitigating the impact of the disease in affected regions.
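
The arithmetic behind the reported comparison can be reproduced from the abstract's figures alone. The sketch below is not the authors' code: the function names `relative_reduction` and `intervals_overlap` are hypothetical helpers, and the interval-overlap check is only an informal heuristic for "similar predictive capability", not a formal significance test.

```python
# Test RMSE values and 95% confidence intervals as reported in the abstract.
reported = {
    "baseline": (468.51, (458.2, 478.8)),
    "xLSTM": (282.98, (259.2, 306.8)),
    "LSTM": (283.50, (259.7, 307.3)),
    "MLP": (293.14, (268.3, 318.0)),
}


def relative_reduction(baseline_rmse, model_rmse):
    """Percentage by which model_rmse is lower than baseline_rmse."""
    return 100.0 * (baseline_rmse - model_rmse) / baseline_rmse


def intervals_overlap(a, b):
    """True when two closed intervals (low, high) share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


baseline_rmse, baseline_ci = reported["baseline"]
for name in ("xLSTM", "LSTM", "MLP"):
    rmse, ci = reported[name]
    reduction = relative_reduction(baseline_rmse, rmse)
    overlaps = intervals_overlap(ci, baseline_ci)
    print(f"{name}: {reduction:.2f}% lower RMSE than baseline; "
          f"CI overlaps baseline CI: {overlaps}")
```

For xLSTM this gives (468.51 - 282.98) / 468.51, which matches the 39.60% reduction stated above. The three advanced models' intervals overlap one another but not the baseline's, which is the pattern the abstract describes.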


Source journal
Journal of Biomedical Informatics (Medicine; Computer Science: Interdisciplinary Applications)
CiteScore: 8.90
Self-citation rate: 6.70%
Articles published: 243
Review time: 32 days
About the journal: The Journal of Biomedical Informatics reflects a commitment to high-quality original research papers, reviews, and commentaries in the area of biomedical informatics methodology. Although we publish articles motivated by applications in the biomedical sciences (for example, clinical medicine, health care, population health, and translational bioinformatics), the journal emphasizes reports of new methodologies and techniques that have general applicability and that form the basis for the evolving science of biomedical informatics. Articles on medical devices; evaluations of implemented systems (including clinical trials of information technologies); or papers that provide insight into a biological process, a specific disease, or treatment options would generally be more suitable for publication in other venues. Papers on applications of signal processing and image analysis are often more suitable for biomedical engineering journals or other informatics journals, although we do publish papers that emphasize the information management and knowledge representation/modeling issues that arise in the storage and use of biological signals and images. System descriptions are welcome if they illustrate and substantiate the underlying methodology that is the principal focus of the report and an effort is made to address the generalizability and/or range of application of that methodology. Note also that, given the international nature of JBI, papers that deal with specific languages other than English, or with country-specific health systems or approaches, are acceptable for JBI only if they offer generalizable lessons that are relevant to the broad JBI readership, regardless of their country, language, culture, or health system.