Radial Neighbors for Provably Accurate Scalable Approximations of Gaussian Processes.

IF 2.4 · Zone 2 (Mathematics) · JCR Q2 (BIOLOGY)
Biometrika Pub Date : 2024-12-01 Epub Date: 2024-06-14 DOI:10.1093/biomet/asae029
Yichen Zhu, Michele Peruzzi, Cheng Li, David B Dunson
Biometrika, volume 111, issue 4, pages 1151-1167. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11993192/pdf/
Citation count: 0

Abstract

In geostatistical problems with massive sample sizes, Gaussian processes can be approximated using sparse directed acyclic graphs to achieve scalable O(n) computational complexity. In these models, data at each location are typically assumed to be conditionally dependent on a small set of parents, which usually includes a subset of the nearest neighbors. These methodologies often exhibit excellent empirical performance, but the lack of theoretical validation leads to unclear guidance in specifying the underlying graphical model and to sensitivity to graph choice. We address these issues by introducing radial neighbors Gaussian processes (RadGP), a class of Gaussian processes based on directed acyclic graphs in which directed edges connect every location to all of its neighbors within a predetermined radius. We prove that any radial neighbors Gaussian process can accurately approximate the corresponding unrestricted Gaussian process in Wasserstein-2 distance, with an error rate determined by the approximation radius, the spatial covariance function, and the spatial dispersion of samples. We offer further empirical validation of our approach via applications on simulated and real-world data, showing excellent performance in both prior and posterior approximations to the original Gaussian process.
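The idea of a radius-based parent set can be illustrated with a small sketch. This is not the authors' construction: the abstract does not specify how locations are ordered or how the graph is oriented, so the code below simply assumes a fixed ordering and takes as parents of each location all earlier-indexed locations within radius `rho`. The exponential covariance, the length scale `ell`, and all function names (`exp_cov`, `radial_parents`, `dag_precision`, `w2_gaussian`) are assumptions chosen for illustration. The Wasserstein-2 distance between two zero-mean Gaussians is computed from the standard closed form, W2^2 = tr(A) + tr(B) - 2 tr((A^{1/2} B A^{1/2})^{1/2}).

```python
import numpy as np

def exp_cov(a, b, ell=0.3):
    """Exponential covariance exp(-||a - b|| / ell); an illustrative choice."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return np.exp(-d / ell)

def radial_parents(locs, rho):
    """Parents of location i: earlier locations (index < i) within distance rho.

    Assumption: a fixed ordering by index makes the graph acyclic.
    """
    d = np.linalg.norm(locs[:, None, :] - locs[None, :, :], axis=-1)
    return [np.flatnonzero(d[i, :i] <= rho) for i in range(len(locs))]

def dag_precision(locs, parents, cov=exp_cov):
    """Precision of the DAG-based Gaussian: Q = (I - B)^T D^{-1} (I - B).

    Row i of B holds the regression weights of location i on its parents;
    D[i] is the corresponding conditional variance.
    """
    n = len(locs)
    B = np.zeros((n, n))
    D = np.zeros(n)
    for i, pa in enumerate(parents):
        c_ii = cov(locs[i:i + 1], locs[i:i + 1])[0, 0]
        if len(pa) == 0:
            D[i] = c_ii
            continue
        C_pp = cov(locs[pa], locs[pa])
        c_ip = cov(locs[i:i + 1], locs[pa])[0]
        w = np.linalg.solve(C_pp, c_ip)
        B[i, pa] = w
        D[i] = c_ii - c_ip @ w
    IB = np.eye(n) - B
    return IB.T @ (IB / D[:, None])

def w2_gaussian(A, B):
    """Wasserstein-2 distance between N(0, A) and N(0, B)."""
    vals, vecs = np.linalg.eigh(A)
    A_half = (vecs * np.sqrt(np.clip(vals, 0, None))) @ vecs.T
    M = A_half @ B @ A_half
    eig = np.linalg.eigvalsh((M + M.T) / 2)
    cross = np.sqrt(np.clip(eig, 0, None)).sum()
    return np.sqrt(max(np.trace(A) + np.trace(B) - 2 * cross, 0.0))

# Small demo on random locations in the unit square.
rng = np.random.default_rng(0)
locs = rng.uniform(size=(40, 2))
exact = exp_cov(locs, locs)
for rho in (0.05, 0.2, 0.5, 10.0):
    Q = dag_precision(locs, radial_parents(locs, rho))
    print(f"rho={rho}: W2 to exact GP = {w2_gaussian(exact, np.linalg.inv(Q)):.4f}")
```

When `rho` exceeds the diameter of the domain, every earlier location is a parent and the factorization reproduces the exact covariance, so the distance is zero up to rounding. Smaller radii give sparser graphs at the cost of some approximation error, which is the trade-off the paper's error rate quantifies. The dense n-by-n matrices here are only for clarity on a tiny example; a scalable implementation would store `B` in sparse form.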

Source journal
Biometrika (category: Biology)
CiteScore: 5.50
Self-citation rate: 3.70%
Articles published: 56
Review time: 6-12 weeks
Journal description: Biometrika is primarily a journal of statistics in which emphasis is placed on papers containing original theoretical contributions of direct or potential value in applications. From time to time, papers in bordering fields are also published.