Christopher Irwin, Alexandros A. Taflanidis, Norberto C. Nadal-Caraballo, Luke A. Aucoin, Madison C. Yawn
Coastal Engineering, volume 201, Article 104768. Published 2025-04-29. DOI: 10.1016/j.coastaleng.2025.104768
https://www.sciencedirect.com/science/article/pii/S0378383925000730
Adaptive covariance tapering for large datasets and application to spatial interpolation of storm surge
Covariance tapering is a popular approach for making Gaussian process (GP)-based spatial interpolation computationally tractable for large datasets. It introduces sparsity in the GP covariance matrix through a compactly supported taper function. The support of the taper function around each spatial node is defined by the taper range. The taper range is selected to achieve the desired degree of global sparsity in the covariance matrix, and it determines the number of connected neighbors (i.e., the local sparsity) around each node. For problems with irregular nodal density, adaptive covariance tapering can be used to improve the accuracy of the taper implementation. In this case, the taper ranges vary spatially, allowing uniform local sparsity to be achieved despite the data irregularities. Optimizing the taper ranges to accomplish this objective carries a computational burden that depends on the size of the database, which prohibits its application to very large datasets. This paper formally considers the adoption of adaptive covariance tapers for such datasets. Though the algorithmic developments are general, the problem is discussed for a specific application, the spatial interpolation of storm surge. To make the optimization of the taper ranges computationally efficient, we propose to use only a small subset of nodes, termed inducing points. An adaptive, iterative formulation is further developed to support the selection of the inducing points, which is shown to be critical for achieving the desired local sparsity at the remaining points. At each iteration, the taper ranges are selected using the current subset of inducing points, the achieved sparsity across all nodes is estimated, and new inducing points are then added within the sub-regions where the discrepancy from the target local sparsity is largest. These points are considered to have the highest expected utility as inducing points. A clustering step prevents inducing points from being added in close proximity to one another. The implementation is demonstrated for interpolation of peak storm surge along the New Jersey coast, using two different domains, one with 64,379 nodes and one with 271,669 nodes.
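The link between taper range and local sparsity can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the abstract does not say which taper function the authors use, so a Wendland-type function is swapped in, and the per-node ranges come from a brute-force nearest-neighbor heuristic rather than the paper's inducing-point optimization. The function names (`wendland_taper`, `knn_ranges`, `local_sparsity`) and the synthetic point cloud are hypothetical.

```python
import math
import random


def wendland_taper(d, r):
    """Wendland-type compactly supported taper (assumed form, not from the paper).

    Equals 1 at d = 0 and is exactly 0 for d >= r, which is what makes
    the tapered covariance matrix sparse.
    """
    if d >= r:
        return 0.0
    s = d / r
    return (1.0 - s) ** 4 * (4.0 * s + 1.0)


def knn_ranges(points, k):
    """Hypothetical heuristic: one taper range per node.

    Each range is the distance to the (k+1)-th nearest other node, so that
    exactly k other nodes lie strictly inside the support (absent ties).
    Brute force, O(n^2 log n); only suitable for small illustrations.
    """
    ranges = []
    for p in points:
        dists = sorted(math.dist(p, q) for q in points)
        ranges.append(dists[k + 1])  # dists[0] is the node itself
    return ranges


def local_sparsity(points, ranges):
    """Number of other nodes inside each node's taper support (d < r_i)."""
    counts = []
    for i, p in enumerate(points):
        inside = sum(
            1
            for j, q in enumerate(points)
            if j != i and math.dist(p, q) < ranges[i]
        )
        counts.append(inside)
    return counts


# Synthetic irregular node density: a dense cluster plus scattered nodes.
rng = random.Random(0)
dense = [(rng.random(), rng.random()) for _ in range(150)]
scattered = [(10.0 * rng.random(), 10.0 * rng.random()) for _ in range(50)]
nodes = dense + scattered

# A single global taper range gives very uneven neighbor counts.
uniform_counts = local_sparsity(nodes, [0.5] * len(nodes))

# Spatially varying ranges give the same neighbor count at every node.
adaptive_counts = local_sparsity(nodes, knn_ranges(nodes, k=10))

print("uniform range:  min/max neighbors =", min(uniform_counts), max(uniform_counts))
print("adaptive range: min/max neighbors =", min(adaptive_counts), max(adaptive_counts))
```

The brute-force search here scales quadratically with the number of nodes, which is the kind of cost that becomes impractical at the domain sizes quoted above. Note also that this sketch only counts neighbors per node; it does not build a symmetric, positive definite tapered covariance matrix, which a full GP implementation would need.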
About the journal:
Coastal Engineering is an international medium for coastal engineers and scientists. Combining practical applications with modern technological and scientific approaches, such as mathematical and numerical modelling, laboratory and field observations and experiments, it publishes fundamental studies as well as case studies on the following aspects of coastal, harbour and offshore engineering: waves, currents and sediment transport; coastal, estuarine and offshore morphology; technical and functional design of coastal and harbour structures; morphological and environmental impact of coastal, harbour and offshore structures.