Multistep Forecasting of New COVID-19 Cases Based on LSTMs Using Bayesian Optimization

Tianqian Chen, Shuyu Chen, Shan Mei, Shuqi An, Xiaohan Yuan, Yuwen Lu
Published in: 2021 International Symposium on Electrical, Electronics and Information Engineering
Publication date: 2021-02-19
DOI: https://doi.org/10.1145/3459104.3459116
Citations: 0

Abstract

Multistep prediction of new Coronavirus Disease (COVID-19) cases plays a vital role during epidemic control, and Long Short-Term Memory (LSTM) based time series models are the most frequently used of the many prediction methods. However, both the cumulative error of multistep prediction and the instability of COVID-19 new-case data degrade LSTM performance on this task. In this paper, we select three countries with severe COVID-19 epidemics (India, Russia, and Chile), predict new cases for the next 15 days with different multistep LSTM network models, and use Bayesian Optimization to explore the hyperparameter space. The results show that: a) Recursive Prediction LSTM performs best (Mean Absolute Percentage Error, MAPE, was reduced to 14.88%, 6.46%, and 16.31% for the three countries respectively), Encoder-Decoder LSTM is second (15.52%, 19.61%, 19.87%), and Vector Output LSTM performs worst overall (23.55%, 26.82%, 19.57%); b) the hyperparameter space contains clearly identifiable regions of extremely poor performance, and the Bayesian Optimizer can concentrate on the good regions, avoiding the cost of tuning from bad hyperparameters; c) new-case data from different countries call for very different model hyperparameters. These poor regions and differing hyperparameter requirements are likely one of the reasons why COVID-19 data from different countries are hard to train on jointly.
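To make the comparison in the abstract concrete, the sketch below contrasts the recursive and vector-output multistep strategies and scores both with MAPE. It is an illustration only, not the authors' implementation: a least-squares linear model stands in for the LSTM so that the example runs with NumPy alone, the series is synthetic rather than real case data, and all function names, the 14-day input window, and the noise settings are assumptions. Only the 15-day horizon comes from the abstract. The Encoder-Decoder variant and the Bayesian hyperparameter search are not shown.

```python
import numpy as np

def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs((actual - predicted) / actual)) * 100.0)

def make_windows(series, n_in, n_out):
    """Slice a 1-D series into (input window, next n_out values) pairs."""
    X, Y = [], []
    for i in range(len(series) - n_in - n_out + 1):
        X.append(series[i:i + n_in])
        Y.append(series[i + n_in:i + n_in + n_out])
    return np.array(X), np.array(Y)

def fit_linear(X, Y):
    """Least-squares fit with a bias term (assumed stand-in for LSTM training)."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    W, *_ = np.linalg.lstsq(Xb, Y, rcond=None)
    return W

def predict_linear(W, x):
    """Apply the fitted weights to one input window."""
    return np.append(x, 1.0) @ W

def recursive_forecast(W, history, n_in, horizon):
    """One-step model applied repeatedly; each prediction is fed back as input,
    which is where multistep errors can accumulate."""
    window = list(history[-n_in:])
    out = []
    for _ in range(horizon):
        nxt = float(predict_linear(W, np.array(window))[0])
        out.append(nxt)
        window = window[1:] + [nxt]
    return np.array(out)

def vector_forecast(W, history, n_in):
    """Model with one output per future day; the whole horizon comes out at once."""
    return predict_linear(W, np.asarray(history[-n_in:]))

# Synthetic daily new-case series (assumption): trend + weekly cycle + noise.
rng = np.random.default_rng(0)
t = np.arange(200)
series = 1000 + 5 * t + 100 * np.sin(2 * np.pi * t / 7) + rng.normal(0, 20, t.size)

n_in, horizon = 14, 15  # 15-day horizon as in the abstract; 14-day window assumed
train, test = series[:-horizon], series[-horizon:]

W_rec = fit_linear(*make_windows(train, n_in, 1))        # one-step model
W_vec = fit_linear(*make_windows(train, n_in, horizon))  # 15-output model

pred_rec = recursive_forecast(W_rec, train, n_in, horizon)
pred_vec = vector_forecast(W_vec, train, n_in)

print(f"recursive MAPE:     {mape(test, pred_rec):.2f}%")
print(f"vector-output MAPE: {mape(test, pred_vec):.2f}%")
```

The ranking this toy produces says nothing about the ranking reported in the paper, which depends on the LSTM architectures, the tuned hyperparameters, and the real data for each country.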