A Cluster-Aggregate-Pool (CAP) ensemble algorithm for improved forecast performance of influenza-like illness
Authors: Ningxi Wei, Xinze Zhou, Wei-Min Huang, Thomas McAndrew
Journal: Epidemics, volume 52, Article 100832 (published 2025-06-03)
DOI: 10.1016/j.epidem.2025.100832
URL: https://www.sciencedirect.com/science/article/pii/S1755436525000209
Seasonal influenza causes, on average, 425,000 hospitalizations and 32,000 deaths per year in the United States. Forecasts of influenza-like illness (ILI), a surrogate for the proportion of patients infected with influenza, support public health decision making. The goal of an ensemble forecast of ILI is to increase accuracy and calibration compared to individual forecasts and to provide a single, cohesive prediction of future influenza. However, an ensemble may be composed of models that produce similar forecasts, causing issues with ensemble forecast performance and non-identifiability. To address these issues, we propose a novel Cluster-Aggregate-Pool or ‘CAP’ ensemble algorithm that first groups individual forecasts into clusters, then aggregates the forecasts that belong to the same cluster into a single forecast (called a cluster forecast), and finally pools the cluster forecasts via a linear pool. We evaluated this algorithm on a benchmark dataset of 7 seasons of ILI plus forecasts generated by 27 individual models as part of the FluSight project. Compared to a non-CAP approach, we find that a CAP ensemble improves calibration by approximately 10% while maintaining similar accuracy. In addition, our CAP algorithm (i) generalizes past ensemble work associated with influenza forecasting and introduces a framework for future ensemble work, (ii) automatically accounts for missing forecasts from individual models, (iii) allows public health officials to participate in the ensemble by assigning individual models to clusters, and (iv) provides an additional signal about when peak influenza may be near.
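The three steps named in the abstract (cluster, aggregate, pool) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not specify the clustering method, the aggregation rule, or how pool weights are chosen, so the sketch assumes hand-assigned cluster labels, an equal-weight average within each cluster, and equal pool weights by default. The model names, the function names `aggregate` and `cap_ensemble`, and the skipping of missing forecasts are likewise hypothetical choices made for illustration.

```python
from collections import defaultdict


def aggregate(forecasts):
    """Equal-weight average of probability vectors (an assumed aggregation rule)."""
    n = len(forecasts)
    return [sum(column) / n for column in zip(*forecasts)]


def cap_ensemble(forecasts, clusters, weights=None):
    """Illustrative Cluster-Aggregate-Pool sketch.

    forecasts: dict of model name -> list of bin probabilities, or None if missing
    clusters:  dict of model name -> cluster label (hand-assigned in this sketch)
    weights:   dict of cluster label -> nonnegative pool weight (equal if omitted)
    """
    # Cluster: group the available forecasts by label, skipping missing ones.
    groups = defaultdict(list)
    for model, probs in forecasts.items():
        if probs is not None:
            groups[clusters[model]].append(probs)
    if not groups:
        raise ValueError("no forecasts available")

    # Aggregate: build one cluster forecast per non-empty cluster.
    cluster_forecasts = {label: aggregate(members) for label, members in groups.items()}

    # Pool: linear pool of cluster forecasts, with weights renormalized
    # over the clusters that are actually present.
    if weights is None:
        weights = {label: 1.0 for label in cluster_forecasts}
    total = sum(weights[label] for label in cluster_forecasts)
    if total <= 0:
        raise ValueError("pool weights must sum to a positive value")
    n_bins = len(next(iter(cluster_forecasts.values())))
    pooled = [0.0] * n_bins
    for label, probs in cluster_forecasts.items():
        w = weights[label] / total
        for i, p in enumerate(probs):
            pooled[i] += w * p
    return pooled


# Hypothetical example: two near-identical models, one dissenting model,
# and one model whose forecast is missing this week.
forecasts = {
    "model_a": [0.6, 0.3, 0.1],
    "model_b": [0.6, 0.3, 0.1],
    "model_c": [0.1, 0.3, 0.6],
    "model_d": None,
}
clusters = {"model_a": "c1", "model_b": "c1", "model_c": "c2", "model_d": "c2"}
pooled = cap_ensemble(forecasts, clusters)
print(pooled)
```

In this toy case the two similar models collapse into a single cluster forecast, so the pooled result weights the two clusters equally. A plain equal-weight average over the three available models would instead count the shared view twice.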
About the journal:
Epidemics publishes papers on infectious disease dynamics in the broadest sense. Its scope covers both within-host dynamics of infectious agents and dynamics at the population level, particularly the interaction between the two. Areas of emphasis include: spread, transmission, persistence, implications and population dynamics of infectious diseases; population and public health as well as policy aspects of control and prevention; dynamics at the individual level; interaction with the environment, ecology and evolution of infectious diseases, as well as population genetics of infectious agents.