Advancing real-time infectious disease forecasting using large language models

Impact factor: 18.3 · JCR Q1 (Computer Science, Interdisciplinary Applications)
Hongru Du, Yang Zhao, Jianan Zhao, Shaochong Xu, Xihong Lin, Yiran Chen, Lauren M. Gardner, Hao ‘Frank’ Yang
Nature Computational Science, volume 5, issue 6, pages 467-480. Published 2025-06-06. DOI: 10.1038/s43588-025-00798-6. URL: https://www.nature.com/articles/s43588-025-00798-6

Abstract

Forecasting the short-term spread of an ongoing disease outbreak poses a challenge owing to the complexity of contributing factors, some of which can be characterized through interlinked, multi-modality variables, and the intersection of public policy and human behavior. Here we introduce PandemicLLM, a framework with multi-modal large language models (LLMs) that reformulates real-time forecasting of disease spread as a text-reasoning problem, with the ability to incorporate real-time, complex, non-numerical information. This approach, through an artificial intelligence–human cooperative prompt design and time-series representation learning, encodes multi-modal data for LLMs. The model is applied to the COVID-19 pandemic, and trained to utilize textual public health policies, genomic surveillance, spatial and epidemiological time-series data, and is tested across all 50 states of the United States for a duration of 19 months. PandemicLLM opens avenues for incorporating various pandemic-related data in heterogeneous formats and shows performance benefits over existing models. PandemicLLM adapts the large language model to predict disease trends by converting diverse disease-relevant data into text. It responds to new variants in real time, offering robust, interpretable forecasts for effective public health responses.
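The idea of recasting forecasting as a text-reasoning problem can be illustrated with a minimal sketch. This is not the authors' prompt design or their learned time-series representation: it is a hand-written serialization, assuming a hypothetical `StateSnapshot` container and made-up example values, that shows how numeric, genomic, and policy inputs might be combined into a single prompt with a categorical trend question.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StateSnapshot:
    """Hypothetical container for one state's inputs at forecast time."""
    state: str
    weekly_hospitalizations: List[float]  # oldest first, most recent last
    policy_text: str                      # free-text public health policy summary
    dominant_variant: str                 # taken from genomic surveillance


def describe_trend(series: List[float], tol: float = 0.05) -> str:
    """Label the last week-over-week change (illustrative threshold rule)."""
    if len(series) < 2 or series[-2] == 0:
        return "unknown"
    change = (series[-1] - series[-2]) / series[-2]
    if change > tol:
        return "increasing"
    if change < -tol:
        return "decreasing"
    return "stable"


def build_prompt(snap: StateSnapshot) -> str:
    """Serialize numeric, genomic, and textual inputs into one text prompt."""
    series = ", ".join(f"{v:g}" for v in snap.weekly_hospitalizations)
    return (
        f"State: {snap.state}\n"
        f"Weekly hospitalizations (oldest to newest): {series}\n"
        f"Recent trend: {describe_trend(snap.weekly_hospitalizations)}\n"
        f"Dominant variant: {snap.dominant_variant}\n"
        f"Policy: {snap.policy_text}\n"
        "Question: Will hospitalizations next week increase, "
        "stay stable, or decrease?"
    )


# Made-up values for illustration only; not data from the paper.
snapshot = StateSnapshot(
    state="Maryland",
    weekly_hospitalizations=[120, 135, 160],
    policy_text="Indoor mask mandate lifted; booster eligibility expanded.",
    dominant_variant="Omicron BA.5",
)
prompt = build_prompt(snapshot)
print(prompt)
```

Framing the target as a small set of trend categories, rather than a raw count, is one way to make the task fit a language model's text output; the threshold `tol` and the three labels here are assumptions chosen for the sketch.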

