Haowei Wang , Kin On Kwok , Ruiyun Li , Steven Riley
Epidemics, volume 53, Article 100856. Published 2025-09-11. DOI: 10.1016/j.epidem.2025.100856. URL: https://www.sciencedirect.com/science/article/pii/S1755436525000441
Forecasting regional COVID-19 hospitalisation in England using ordinal machine learning method
Background
The COVID-19 pandemic placed substantial pressure on healthcare, with many systems needing to prepare for and mitigate surges in demand caused by multiple overlapping waves of infection. Public health agencies and health system managers therefore benefitted from short-term forecasts of respiratory infections that helped them manage services. While quantitative forecasts treating hospital admissions as a continuous variable existed, many health managers prefer discrete levels of demand, similar to approaches used in weather and flood forecasting. However, effective tools for generating precise sub-national forecasts remained limited.
Methods
We forecast regional COVID-19 hospitalisations in England, training models on data from March 2020 to December 2021 and evaluating predictions on data from January to December 2022. We transformed regional admission counts into an ordinal variable using n-tile and n-uniform methods. We further developed a method based on XGBoost, previously used for influenza, so that it exploits the ordering information in ordinal hospital admission levels. We incorporated several types of predictors: epidemiological data, including weekly regional COVID-19 cases and hospital admissions; weather conditions; and mobility data for multiple categories of locations. We also examined the impact of the discretisation method and the number of ordinal levels.
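The two discretisation schemes can be illustrated with a minimal sketch. It assumes that "n-tile" means quantile-based cut points (roughly equal numbers of weeks per level) and "n-uniform" means equal-width intervals over the observed range; the abstract does not define either term precisely, so this reading, the function names and the admission counts below are all hypothetical.

```python
from bisect import bisect_right
from statistics import quantiles


def n_tile_edges(values, n_levels):
    """Interior cut points that place roughly equal numbers of observations in each level."""
    return quantiles(values, n=n_levels, method="inclusive")


def n_uniform_edges(values, n_levels):
    """Interior cut points that split the observed range into equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_levels
    return [lo + width * k for k in range(1, n_levels)]


def to_levels(values, edges):
    """Map each count to an ordinal level 0..len(edges), where 0 is the lowest demand."""
    return [bisect_right(edges, v) for v in values]


# Hypothetical weekly admission counts for one region (not real data).
admissions = [12, 15, 18, 22, 30, 41, 55, 80, 120, 200, 310, 450]

tile_levels = to_levels(admissions, n_tile_edges(admissions, 4))
uniform_levels = to_levels(admissions, n_uniform_edges(admissions, 4))

print(tile_levels)     # three weeks in each of the four levels
print(uniform_levels)  # nine weeks in level 0, one week in each higher level
```

On right-skewed counts like these, the quantile-based levels are balanced, whereas the equal-width levels put most weeks in the lowest level.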
Results
We found that adding mobility data improved predictive performance more than either relying on epidemiological data alone or adding weather data. When both weather and mobility data were used in addition to epidemiological data, the results were very similar to those of models using only epidemiological and mobility data. These results are robust to the number of levels chosen for the forecast target.
Conclusion
Accurate ordinal forecasts of COVID-19 hospitalisations were obtained using XGBoost and mobility data. While uniform ordinal levels showed higher apparent accuracy, we recommend n-tile ordinal levels, which contain far richer information.
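One plausible reading of "higher apparent accuracy" is sketched below; it is an illustration, not the authors' analysis. If equal-width levels concentrate most weeks in a single level, a trivial forecast that always predicts that level already scores well, while the labels carry less information. The label sequences, the function names and the use of Shannon entropy as a measure of "information" are all assumptions made for this example.

```python
from collections import Counter
from math import log2


def majority_baseline_accuracy(levels):
    """Accuracy of a trivial forecast that always predicts the most common level."""
    return Counter(levels).most_common(1)[0][1] / len(levels)


def entropy_bits(levels):
    """Shannon entropy, in bits, of the empirical distribution of levels."""
    n = len(levels)
    return -sum((c / n) * log2(c / n) for c in Counter(levels).values())


# Hypothetical four-level labels for twelve weeks (not real data).
tile_levels = [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3]     # balanced levels
uniform_levels = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 3]  # skewed levels

for name, levels in [("n-tile", tile_levels), ("n-uniform", uniform_levels)]:
    print(name, majority_baseline_accuracy(levels), round(entropy_bits(levels), 3))
```

In this toy case the skewed labels give the trivial forecast an accuracy of 0.75 against 0.25 for the balanced labels, yet the balanced labels have higher entropy, so a correct forecast of them tells a health manager more.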
About the journal:
Epidemics publishes papers on infectious disease dynamics in the broadest sense. Its scope covers both within-host dynamics of infectious agents and dynamics at the population level, particularly the interaction between the two. Areas of emphasis include: spread, transmission, persistence, implications and population dynamics of infectious diseases; population and public health as well as policy aspects of control and prevention; dynamics at the individual level; interaction with the environment, ecology and evolution of infectious diseases, as well as population genetics of infectious agents.