Adaptive covariance tapering for large datasets and application to spatial interpolation of storm surge

IF 4.5; Zone 2, Engineering Technology; JCR Q1, Engineering, Civil
Christopher Irwin, Alexandros A. Taflanidis, Norberto C. Nadal-Caraballo, Luke A. Aucoin, Madison C. Yawn
Coastal Engineering, Volume 201, Article 104768. Published 2025-04-29. DOI: 10.1016/j.coastaleng.2025.104768. URL: https://www.sciencedirect.com/science/article/pii/S0378383925000730

Abstract

Covariance tapering is a popular way to make Gaussian process (GP)-based spatial interpolation computationally tractable for large datasets. It introduces sparsity in the GP covariance matrix by means of a compactly supported taper function. The support of the taper function around each spatial node is defined by the taper range. The taper range is selected to achieve the desired degree of global sparsity in the covariance matrix, and it determines the number of connected neighbors (i.e., the local sparsity) around each node. For problems with irregular nodal density, adaptive covariance tapering can improve the accuracy of the taper implementation. In this case, the taper ranges vary spatially, allowing uniform local sparsity to be achieved despite the data irregularities. Optimizing the taper ranges to accomplish this objective carries a computational burden that scales with the size of the database, which prohibits its application to very large datasets. This paper formally considers the adoption of adaptive covariance tapers for such datasets. Though the algorithmic developments are general, the problem is discussed for a specific application, the spatial interpolation of storm surge. To make the optimization of the taper ranges computationally efficient, we propose using only a small subset of nodes, termed inducing points. An adaptive, iterative formulation is further developed to support the selection of the inducing points, which is shown to be critical for achieving the desired local sparsity at the remaining points. At each iteration, the taper ranges are selected using the current subset of inducing points, the achieved sparsity across all nodes is estimated, and new inducing points are then added within the sub-regions where the discrepancy from the target local sparsity is largest. These points are considered to have the highest expected utility as inducing points. A clustering step prevents inducing points from being added in close proximity to one another. The implementation is demonstrated for interpolation of peak storm surge along the New Jersey coast, using two different domains, one with 64,379 nodes and one with 271,669 nodes.
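The relationship between taper range and local sparsity described in the abstract can be illustrated with a small sketch. It is not the authors' method: the Wendland-1 taper, the exponential base covariance, the synthetic node layout, and the k-nearest-neighbor rule used to set per-node taper ranges are all assumptions chosen for illustration, standing in for the taper-range optimization the paper describes. All function and variable names are hypothetical.

```python
import numpy as np

def wendland_taper(dist, theta):
    """Wendland-1 taper (assumed choice): exactly zero for dist >= theta."""
    r = dist / theta
    return np.where(r < 1.0, (1.0 - r) ** 4 * (4.0 * r + 1.0), 0.0)

def pairwise_distances(xy):
    """Dense Euclidean distance matrix; fine for a small illustrative node set."""
    diff = xy[:, None, :] - xy[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def neighbor_counts(dist, theta):
    """Local sparsity: number of other nodes inside each node's taper range."""
    theta = np.broadcast_to(np.asarray(theta, dtype=float), (dist.shape[0],))
    within = dist <= theta[:, None]
    return within.sum(axis=1) - 1  # subtract 1 to exclude the node itself

def knn_taper_ranges(dist, k):
    """Stand-in for taper-range optimization: distance to the k-th neighbor."""
    # After sorting each row, column 0 is the node itself (distance 0).
    return np.sort(dist, axis=1)[:, k]

# Synthetic nodes with irregular density: a tight cluster plus a sparse background.
rng = np.random.default_rng(0)
dense = rng.normal(loc=0.0, scale=0.05, size=(150, 2))
sparse = rng.uniform(-1.0, 1.0, size=(50, 2))
xy = np.vstack([dense, sparse])
dist = pairwise_distances(xy)

k = 10             # target number of connected neighbors per node
fixed_theta = 0.2  # single taper range shared by all nodes

# A single taper range gives very uneven neighbor counts on irregular nodes.
fixed_counts = neighbor_counts(dist, fixed_theta)

# Spatially varying taper ranges give the same neighbor count everywhere.
adaptive_theta = knn_taper_ranges(dist, k)
adaptive_counts = neighbor_counts(dist, adaptive_theta)

# Tapered covariance for the fixed-range case: elementwise product of an
# exponential covariance (unit variance, assumed) and the taper.
base_cov = np.exp(-dist / 0.5)
tapered = base_cov * wendland_taper(dist, fixed_theta)
nonzero_fraction = np.count_nonzero(tapered) / tapered.size

print("fixed range: min/max neighbors =", fixed_counts.min(), fixed_counts.max())
print("adaptive range: min/max neighbors =", adaptive_counts.min(), adaptive_counts.max())
print("nonzero fraction of tapered covariance =", round(nonzero_fraction, 3))
```

The sketch stops at counting neighbors for the adaptive case. Per-node ranges define an asymmetric neighbor relation, so turning them into a symmetric, positive definite taper needs an additional construction that is not attempted here. The dense distance matrix is also only practical for small node sets. At the scale of the domains in the paper, a spatial index and a sparse matrix format would be needed.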
Source journal
Coastal Engineering (Engineering Technology; Engineering: Ocean)
CiteScore: 9.20
Self-citation rate: 13.60%
Articles published: 0
Review time: 3.5 months
About the journal: Coastal Engineering is an international medium for coastal engineers and scientists. Combining practical applications with modern technological and scientific approaches, such as mathematical and numerical modelling, laboratory and field observations and experiments, it publishes fundamental studies as well as case studies on the following aspects of coastal, harbour and offshore engineering: waves, currents and sediment transport; coastal, estuarine and offshore morphology; technical and functional design of coastal and harbour structures; morphological and environmental impact of coastal, harbour and offshore structures.