Rongjie Huang, Christopher McMahan, Brian Herrin, Alexander McLain, Bo Cai, Stella Self
Journal: Infectious Disease Modelling
Publication date: 2024-10-04
DOI: 10.1016/j.idm.2024.09.008
URL: https://www.sciencedirect.com/science/article/pii/S2468042724001131
Gradient boosting: A computationally efficient alternative to Markov chain Monte Carlo sampling for fitting large Bayesian spatio-temporal binomial regression models
Disease forecasting and surveillance often involve fitting models to a tremendous volume of historical testing data collected over space and time. Bayesian spatio-temporal regression models fit with Markov chain Monte Carlo (MCMC) methods are commonly used for such data. When the spatio-temporal support of the model is large, implementing an MCMC algorithm becomes a significant computational burden. This research proposes a computationally efficient gradient boosting algorithm for fitting a Bayesian spatio-temporal mixed effects binomial regression model. We demonstrate our method on a disease forecasting model and compare it to a computationally optimized MCMC approach. Both methods are used to produce monthly forecasts for Lyme disease, anaplasmosis, ehrlichiosis, and heartworm disease in domestic dogs for the contiguous United States. The data have a spatial support of 3108 counties and a temporal support of 108–138 months with 71–135 million test results. The proposed estimation approach is several orders of magnitude faster than the optimized MCMC algorithm, with a similar mean absolute prediction error.
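As a rough illustration of the general idea in the abstract, the following is a minimal sketch of componentwise gradient boosting for a plain binomial logit regression. It is not the authors' algorithm: it omits the spatio-temporal random effects and the Bayesian prior structure, and every name in it (`boost_binomial`, `log_lik`, the simulated data, the step size) is a hypothetical choice made for this example. Each iteration fits every covariate separately to the current residual by weighted least squares, then takes a small step along the single best-fitting covariate.

```python
import numpy as np


def log_lik(X, y, n, beta):
    """Binomial log-likelihood (up to an additive constant) under a logit link."""
    eta = X @ beta
    return float(np.sum(y * eta - n * np.logaddexp(0.0, eta)))


def boost_binomial(X, y, n, n_iter=500, step=0.5):
    """Componentwise gradient boosting for a binomial logit model (illustrative only).

    X : (m, p) design matrix; include a column of ones for the intercept
    y : (m,) number of positive tests per unit
    n : (m,) number of tests per unit (all > 0)
    """
    m, p = X.shape
    beta = np.zeros(p)
    # Weighted sum of squares of each column, with weights equal to the test counts.
    denom = (n[:, None] * X ** 2).sum(axis=0)
    for _ in range(n_iter):
        prob = 1.0 / (1.0 + np.exp(-(X @ beta)))
        # Gradient of the log-likelihood with respect to beta.
        grad = X.T @ (y - n * prob)
        # Weighted least-squares slope of the residual (y/n - prob) on each column alone.
        coef = grad / denom
        # Reduction in weighted residual sum of squares achieved by each column.
        gain = coef * grad
        j = int(np.argmax(gain))
        # Shrunken update of the best single coefficient.
        beta[j] += step * coef[j]
    return beta


# Simulated data (an assumption for this sketch, unrelated to the paper's data).
rng = np.random.default_rng(0)
m = 500
X = np.column_stack([np.ones(m), rng.normal(size=m), rng.normal(size=m)])
true_beta = np.array([-1.0, 0.8, 0.0])
n = rng.integers(20, 100, size=m)
y = rng.binomial(n, 1.0 / (1.0 + np.exp(-(X @ true_beta))))

beta_hat = boost_binomial(X, y, n)
fitted = 1.0 / (1.0 + np.exp(-(X @ beta_hat)))
# Mean absolute error between observed and fitted proportions of positive tests.
mae = float(np.mean(np.abs(y / n - fitted)))
print("estimated coefficients:", beta_hat)
print("mean absolute prediction error:", mae)
```

Each iteration costs only a few matrix-vector products, which hints at why a boosting-style fit can be much cheaper than drawing a long MCMC chain, although this toy example says nothing about the speedups reported in the paper.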
About the journal:
Infectious Disease Modelling is a peer-reviewed, open access journal. Its main objective is to facilitate research that combines mathematical modelling, retrieval and analysis of infectious disease data, and public health decision support. The journal actively encourages original research that improves this interface, as well as review articles that highlight innovative methodologies relevant to data collection, informatics, and policy making in the field of public health.