Spatial meshing for general Bayesian multivariate models.

IF 5.2 3区计算机科学 Q1 AUTOMATION & CONTROL SYSTEMS

Journal of Machine Learning Research Pub Date : 2024-03-01

Michele Peruzzi, David B Dunson

{"title":"Spatial meshing for general Bayesian multivariate models.","authors":"Michele Peruzzi, David B Dunson","doi":"","DOIUrl":null,"url":null,"abstract":"<p><p>Quantifying spatial and/or temporal associations in multivariate geolocated data of different types is achievable via spatial random effects in a Bayesian hierarchical model, but severe computational bottlenecks arise when spatial dependence is encoded as a latent Gaussian process (GP) in the increasingly common large scale data settings on which we focus. The scenario worsens in non-Gaussian models because the reduced analytical tractability leads to additional hurdles to computational efficiency. In this article, we introduce Bayesian models of spatially referenced data in which the likelihood or the latent process (or both) are not Gaussian. First, we exploit the advantages of spatial processes built via directed acyclic graphs, in which case the spatial nodes enter the Bayesian hierarchy and lead to posterior sampling via routine Markov chain Monte Carlo (MCMC) methods. Second, motivated by the possible inefficiencies of popular gradient-based sampling approaches in the multivariate contexts on which we focus, we introduce the simplified manifold preconditioner adaptation (SiMPA) algorithm which uses second order information about the target but avoids expensive matrix operations. We demostrate the performance and efficiency improvements of our methods relative to alternatives in extensive synthetic and real world remote sensing and community ecology applications with large scale data at up to hundreds of thousands of spatial locations and up to tens of outcomes. Software for the proposed methods is part of R package meshed, available on CRAN.</p>","PeriodicalId":50161,"journal":{"name":"Journal of Machine Learning Research","volume":"25 ","pages":""},"PeriodicalIF":5.2000,"publicationDate":"2024-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12237421/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Machine Learning Research","FirstCategoryId":"94","ListUrlMain":"","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"AUTOMATION & CONTROL SYSTEMS","Score":null,"Total":0}

引用次数: 0

Abstract

Quantifying spatial and/or temporal associations in multivariate geolocated data of different types is achievable via spatial random effects in a Bayesian hierarchical model, but severe computational bottlenecks arise when spatial dependence is encoded as a latent Gaussian process (GP) in the increasingly common large scale data settings on which we focus. The scenario worsens in non-Gaussian models because the reduced analytical tractability leads to additional hurdles to computational efficiency. In this article, we introduce Bayesian models of spatially referenced data in which the likelihood or the latent process (or both) are not Gaussian. First, we exploit the advantages of spatial processes built via directed acyclic graphs, in which case the spatial nodes enter the Bayesian hierarchy and lead to posterior sampling via routine Markov chain Monte Carlo (MCMC) methods. Second, motivated by the possible inefficiencies of popular gradient-based sampling approaches in the multivariate contexts on which we focus, we introduce the simplified manifold preconditioner adaptation (SiMPA) algorithm which uses second order information about the target but avoids expensive matrix operations. We demostrate the performance and efficiency improvements of our methods relative to alternatives in extensive synthetic and real world remote sensing and community ecology applications with large scale data at up to hundreds of thousands of spatial locations and up to tens of outcomes. Software for the proposed methods is part of R package meshed, available on CRAN.

本刊更多论文

一般贝叶斯多元模型的空间网格划分。

通过贝叶斯层次模型中的空间随机效应，可以对不同类型的多变量地理定位数据中的空间和/或时间关联进行量化，但是在我们关注的日益常见的大规模数据设置中，当空间依赖性被编码为潜在高斯过程（GP）时，会出现严重的计算瓶颈。这种情况在非高斯模型中更糟，因为降低的分析可追溯性导致了计算效率的额外障碍。在本文中，我们介绍了空间参考数据的贝叶斯模型，其中似然或潜在过程（或两者）不是高斯的。首先，我们利用由有向无环图构建的空间过程的优势，在这种情况下，空间节点进入贝叶斯层次，并通过常规的马尔可夫链蒙特卡罗（MCMC）方法导致后验采样。其次，考虑到我们所关注的基于梯度的流行采样方法在多变量环境下可能存在的低效率，我们引入了简化的流形预调节器自适应（SiMPA）算法，该算法使用目标的二阶信息，但避免了昂贵的矩阵运算。我们展示了我们的方法相对于广泛的合成和现实世界遥感和社区生态应用中的替代方法的性能和效率改进，这些方法具有多达数十万个空间位置和多达数十个结果的大规模数据。所提出的方法的软件是R包网格的一部分，可以在CRAN上获得。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

Journal of Machine Learning Research 工程技术-计算机：人工智能

CiteScore

18.80

自引率

0.00%

发文量

审稿时长

3 months

期刊介绍： The Journal of Machine Learning Research (JMLR) provides an international forum for the electronic and paper publication of high-quality scholarly articles in all areas of machine learning. All published papers are freely available online. JMLR has a commitment to rigorous yet rapid reviewing. JMLR seeks previously unpublished papers on machine learning that contain: new principled algorithms with sound empirical validation, and with justification of theoretical, psychological, or biological nature; experimental and/or theoretical studies yielding new insight into the design and behavior of learning in intelligent systems; accounts of applications of existing techniques that shed light on the strengths and weaknesses of the methods; formalization of new learning tasks (e.g., in the context of new applications) and of methods for assessing performance on those tasks; development of new analytical frameworks that advance theoretical studies of practical learning methods; computational models of data from natural learning systems at the behavioral or neural level; or extremely well-written surveys of existing work.