BARTSIMP: Flexible spatial covariate modeling and prediction using Bayesian Additive Regression Trees
Authors: Alex Ziyu Jiang, Jon Wakefield
Journal: Spatial and Spatio-Temporal Epidemiology, volume 55, Article 100757
Publication date: 2025-10-08
DOI: 10.1016/j.sste.2025.100757
URL: https://www.sciencedirect.com/science/article/pii/S1877584525000486
Citations: 0
Abstract
Prediction is a classic challenge in spatial statistics, and including spatial covariates in a model with latent spatial effects can greatly improve predictive performance. It is desirable to develop flexible regression models that allow for nonlinearities and interactions in the covariate specification. Existing machine learning approaches that allow for spatial dependence in the residuals fail to provide reliable uncertainty estimates. In this paper, we investigate the combination of a Gaussian process spatial model with a Bayesian Additive Regression Tree (BART) model. The computational burden of the approach is reduced by combining Markov chain Monte Carlo (MCMC) with the Integrated Nested Laplace Approximation (INLA) technique. We first study the performance of the method via simulation. We then use the model to predict anthropometric responses in Kenya, where the data were collected via a complex sampling design. In particular, the household survey data were collected via stratified two-stage unequal probability cluster sampling, which requires special care in modeling.
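For orientation, a model of the kind the abstract describes could be written as below. This is only a sketch under assumptions: the abstract gives no equations, so the additive form, the Gaussian response, and every symbol here ($y$, $f$, $u$, $\varepsilon$, $T_j$, $M_j$, $C_\theta$, $m$) are illustrative notation, not necessarily the authors' own.

```latex
% Illustrative notation only; the additive Gaussian form is an assumption,
% not a specification taken from the paper.
\begin{align*}
  y(s_i) &= f\bigl(x(s_i)\bigr) + u(s_i) + \varepsilon_i,
    & \varepsilon_i &\sim \mathcal{N}(0, \sigma^2), \\
  f(x)   &= \sum_{j=1}^{m} g(x;\, T_j, M_j)
    && \text{(BART: a sum of $m$ regression trees)}, \\
  u(\cdot) &\sim \mathcal{GP}\bigl(0,\, C_\theta(\cdot,\cdot)\bigr)
    && \text{(latent Gaussian process spatial effect)}.
\end{align*}
```

Here $s_i$ is the location of observation $i$, $x(s_i)$ the spatial covariates at that location, $T_j$ and $M_j$ the structure and leaf values of tree $j$, and $C_\theta$ a spatial covariance function with hyperparameters $\theta$. Under this reading, a natural division of labor would be for MCMC to update the trees while INLA handles the latent spatial field and its hyperparameters conditional on the trees; the abstract says only that the two techniques are combined.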