ChatGPT-Assisted Deep Learning Models for Influenza-Like Illness Prediction in Mainland China: Time Series Analysis.

IF 5.8 | Zone 2, Medicine | Q1, Health Care Sciences & Services

Journal of Medical Internet Research | Pub Date: 2025-06-27 | DOI: 10.2196/74423

Weihong Huang, Wudi Wei, Xiaotao He, Baili Zhan, Xiaoting Xie, Meng Zhang, Shiyi Lai, Zongxiang Yuan, Jingzhen Lai, Rongfeng Chen, Junjun Jiang, Li Ye, Hao Liang

Citation: Journal of Medical Internet Research, 2025; 27: e74423. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12227151/pdf/


Abstract

Background: Influenza in mainland China results in a large number of outpatient and emergency visits related to influenza-like illness (ILI) annually. While deep learning models show promise for improving influenza forecasting, their technical complexity remains a barrier to practical implementation. Large language models, such as ChatGPT, offer the potential to reduce these barriers by supporting automated code generation, debugging, and model optimization.

Objective: This study aimed to evaluate the predictive performance of several deep learning models for ILI positivity rates in mainland China and to explore the auxiliary role of ChatGPT-assisted development in facilitating model implementation.

Methods: ILI positivity rate data spanning from 2014 to 2024 were obtained from the Chinese National Influenza Center (CNIC) database. In total, 5 deep learning architectures were developed using a ChatGPT-assisted workflow covering code generation, error debugging, and performance optimization: long short-term memory (LSTM), neural basis expansion analysis for time series (N-BEATS), transformer, temporal fusion transformer (TFT), and time-series dense encoder (TiDE). Models were trained on data from 2014 to 2023 and tested on holdout data from 2024 (weeks 1-39). Performance was evaluated using mean squared error (MSE), mean absolute error (MAE), and mean absolute percentage error (MAPE).
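The evaluation protocol described in the Methods (a chronological train/test split followed by three error metrics) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the (year, week, rate) row layout, and the sample values are all assumptions. MAPE is written in its usual form, which is undefined when an observed value is zero.

```python
from typing import List, Sequence, Tuple

# Hypothetical row layout: (year, week, ILI positivity rate in %)
Row = Tuple[int, int, float]


def chronological_split(rows: Sequence[Row], test_year: int = 2024,
                        last_test_week: int = 39) -> Tuple[List[Row], List[Row]]:
    """Train on every year before test_year; test on weeks 1..last_test_week of test_year."""
    train = [r for r in rows if r[0] < test_year]
    test = [r for r in rows if r[0] == test_year and 1 <= r[1] <= last_test_week]
    return train, test


def mse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent; requires nonzero observed values."""
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an observed value is zero")
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Made-up values, used only to exercise the functions above.
rows = [(2023, 51, 30.0), (2023, 52, 40.0), (2024, 1, 20.0), (2024, 2, 10.0), (2024, 40, 5.0)]
train, test = chronological_split(rows)
actual = [rate for _, _, rate in test]
predicted = [18.0, 13.0]  # stand-in for a model's forecasts
print(mae(actual, predicted), mse(actual, predicted), mape(actual, predicted))
```

Splitting by calendar position rather than at random keeps future observations out of the training set, which is the usual reason for a holdout year in time series evaluation.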

Results: ILI trends exhibited clear seasonal patterns with winter peaks and summer troughs, alongside marked fluctuations during the COVID-19 pandemic period (2020-2022). All 5 deep learning models were successfully constructed, debugged, and optimized with the assistance of ChatGPT. Among the 5 models, TiDE achieved the best predictive performance nationally (MAE=5.551, MSE=43.976, MAPE=72.413%) and in the southern region (MAE=7.554, MSE=89.708, MAPE=74.475%). In the northern region, where forecasting proved more challenging, TiDE still performed best (MAE=4.131, MSE=28.922), although high percentage errors remained (MAPE>400%). N-BEATS demonstrated the second-best performance nationally (MAE=9.423) and showed greater stability in the north (MAE=6.325). In contrast, transformer and TFT consistently underperformed, with national MAE values of 10.613 and 12.538, respectively. TFT exhibited the highest deviation (national MAPE=169.29%). Extreme regional disparities were observed, particularly in northern China, where LSTM and TFT generated MAPE values exceeding 1918%, despite LSTM's moderate performance in the south (MAE=9.460).
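A small MAE alongside a MAPE in the hundreds of percent, as reported for the northern region, is arithmetically possible because MAPE divides each error by the observed value, so weeks with positivity rates near zero dominate the average. The abstract does not state that this is the cause; the sketch below uses made-up numbers only to show the effect.

```python
def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent; observed values must be nonzero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical weekly positivity rates (%), not taken from the study.
high_baseline = [20.0, 25.0, 30.0]   # e.g. weeks near a seasonal peak
low_baseline = [0.5, 1.0, 0.25]      # e.g. weeks near a seasonal trough

# Apply the same absolute error of 2 percentage points to every week.
high_pred = [a + 2.0 for a in high_baseline]
low_pred = [a + 2.0 for a in low_baseline]

print("high baseline:", mae(high_baseline, high_pred), mape(high_baseline, high_pred))
print("low baseline: ", mae(low_baseline, low_pred), mape(low_baseline, low_pred))
```

Both series have the same MAE, but the low-baseline series has a MAPE above 400% while the high-baseline series stays below 10%. For this reason MAE and MSE are the more comparable figures when observed rates approach zero.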

Conclusions: Deep learning models, particularly TiDE, demonstrate strong potential for accurate ILI forecasting across diverse regions of China. Furthermore, large language models like ChatGPT can substantially enhance modeling efficiency and accessibility by assisting nontechnical users in model development. These findings support the integration of AI-assisted workflows into epidemic prediction systems as a scalable approach for improving public health preparedness.




Source journal

Journal of Medical Internet Research (Medicine: Health Care)

CiteScore: 14.40

Self-citation rate: 5.40%

Articles published: 654

Review time: 1 month

About the journal: The Journal of Medical Internet Research (JMIR), founded in 1999, publishes research in health informatics and health services. The journal focuses on digital health, data science, health informatics, and emerging technologies for health, medicine, and biomedical research. It ranks in the first quartile (Q1) by Impact Factor in these disciplines and is ranked #1 on Google Scholar within the "Medical Informatics" discipline.