Data-driven projection pursuit adaptation of polynomial chaos expansions for dependent high-dimensional parameters

IF 6.9 · Zone 1 (Engineering and Technology) · JCR Q1 (ENGINEERING, MULTIDISCIPLINARY)

Computer Methods in Applied Mechanics and Engineering · Pub Date: 2024-11-05 · DOI: 10.1016/j.cma.2024.117505

Xiaoshu Zeng, Roger Ghanem

Volume 433, Article 117505. Full text: https://www.sciencedirect.com/science/article/pii/S004578252400759X

Citation count: 0

Abstract

Uncertainty quantification (UQ) and inference involving a large number of parameters are valuable tools for problems associated with heterogeneous and non-stationary behaviors. The difficulty of these problems is exacerbated when the parameters are statistically dependent, which requires statistical characterization over joint measures. Probabilistic modeling methodologies are effective tools for UQ and inference. Among these, polynomial chaos expansions (PCE), when adapted to low-dimensional quantities of interest (QoI), provide efficient yet accurate approximations of these QoI in terms of an adapted orthogonal basis. These adaptation techniques have been cast as projection pursuits in Gaussian Hilbert space, in what has been referred to as projection pursuit adaptation (PPA) by Xiaoshu Zeng and Roger Ghanem (2023). The PPA method efficiently identifies an optimal low-dimensional space for representing the QoI and simultaneously evaluates an optimal PCE within that space. The quality of this approximation depends on the size of the training dataset, which is typically a function of the adapted reduced dimension. The complexity of the problem is thus mediated by the complexity of the low-dimensional quantity of interest and not by the complexity of the high-dimensional parameter space.
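The idea of adapting a PCE to a low-dimensional QoI can be illustrated with a minimal sketch. It is not the PPA algorithm of the paper: the hidden direction `a_true`, the function `qoi`, and the two-step procedure (a linear fit to guess one direction, then a one-dimensional Hermite expansion along it) are assumptions made purely for illustration, whereas PPA optimizes the projection and the expansion together.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermevander

rng = np.random.default_rng(0)
d, n_train, n_test, degree = 20, 400, 2000, 3

# Hypothetical hidden direction and QoI (illustration only, not from the paper).
a_true = rng.standard_normal(d)
a_true /= np.linalg.norm(a_true)

def qoi(xi):
    """Scalar QoI that varies only along a_true."""
    eta = xi @ a_true
    return eta + 0.3 * eta**2 + 0.1 * eta**3

xi_train = rng.standard_normal((n_train, d))
y_train = qoi(xi_train)

# Step 1: first-order least-squares fit in all d Gaussian variables.
# The normalized linear coefficients give a crude estimate of the direction.
design = np.column_stack([np.ones(n_train), xi_train])
lin_coef, *_ = np.linalg.lstsq(design, y_train, rcond=None)
a_hat = lin_coef[1:] / np.linalg.norm(lin_coef[1:])

# Step 2: one-dimensional probabilists' Hermite PCE in eta = a_hat . xi.
eta_train = xi_train @ a_hat
pce_coef, *_ = np.linalg.lstsq(hermevander(eta_train, degree), y_train, rcond=None)

def surrogate(xi):
    """Evaluate the adapted 1-D PCE at new d-dimensional Gaussian samples."""
    return hermevander(xi @ a_hat, degree) @ pce_coef

# Check on fresh samples.
xi_test = rng.standard_normal((n_test, d))
y_test = qoi(xi_test)
alignment = abs(a_hat @ a_true)
rel_err = np.linalg.norm(surrogate(xi_test) - y_test) / np.linalg.norm(y_test)
print(f"alignment = {alignment:.3f}, relative test error = {rel_err:.3f}")
```

In this toy setting the adapted expansion has 4 coefficients, while a full total-degree-3 expansion in 20 variables would have 1771. That gap is why the required training set scales with the reduced dimension rather than with the ambient one.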

In this paper, our objective is to tackle the challenge of dependent parameters while constructing the PPA, utilizing a generative data-driven framework that requires a fixed number of pre-evaluated (parameter, QoI) pairs. While PCE approaches dealing with dependent input parameters have already been introduced by Christian Soize and Roger Ghanem (2004), their coupling with basis adaptation remains an outstanding task, without which they remain plagued by the curse of dimensionality. For parameter sets of modest size, mappings such as the Rosenblatt transformation can be employed to decouple the dependent variables. This strategy requires access to the joint distribution of the random variables, which is usually unavailable, and estimating it requires significantly more data than is typically available. To overcome these limitations, we propose leveraging multivariate Regular Vine (R-vine) copulas to encapsulate the dependency structure within parameters, manifested as a joint cumulative distribution function (CDF). The Rosenblatt transformation can then be applied to decouple the dependent input data, mapping them to samples from independent Gaussian variables. Conversely, we can generate dependent samples from independent Gaussian variables while maintaining the learned dependencies. This generative capability ensures that the reconstructed dependency structure is faithfully preserved in the generated samples. Endowed with the ability to diagonalize measures on product spaces, the R-vine copula blends seamlessly with the PPA method, resulting in a unified procedure for constructing optimally reduced PCE models tailored for high-dimensional problems with dependent parameter spaces. The proposed methodology attains high accuracy for both UQ and inference. For inference, the constructed PCE model serves as a generative and convergent surrogate model for machine learning regression. The efficiency of the proposed methodology is validated through two distinct applications: water flow through a borehole and structural dynamics.
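The decoupling and generation steps described above can be sketched in two dimensions. One substitution should be stated plainly: the sketch fits a Gaussian copula, for which the Rosenblatt transformation has a closed form, instead of the R-vine copula used in the paper. The synthetic data `x`, the rank-based marginals, and all variable names are assumptions for illustration.

```python
import numpy as np
from statistics import NormalDist

_std_normal = NormalDist()
norm_ppf = np.vectorize(_std_normal.inv_cdf)  # standard normal quantile function
norm_cdf = np.vectorize(_std_normal.cdf)      # standard normal CDF

rng = np.random.default_rng(1)
n = 2000

# Hypothetical dependent, non-Gaussian inputs standing in for measured parameters.
g = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.8], [0.8, 1.0]], size=n)
x = np.column_stack([np.exp(g[:, 0]), g[:, 1] ** 3])

def rank_corr(a):
    """Correlation between the column-wise ranks of a two-column sample."""
    r = a.argsort(axis=0).argsort(axis=0)
    return np.corrcoef(r, rowvar=False)[0, 1]

# 1) Marginals: empirical CDF via ranks, then map to Gaussian scores.
ranks = x.argsort(axis=0).argsort(axis=0) + 1
u = ranks / (n + 1)            # strictly inside (0, 1)
z = norm_ppf(u)

# 2) Dependence: Gaussian-copula parameter estimated from the scores.
rho = np.corrcoef(z, rowvar=False)[0, 1]

# 3) Rosenblatt transformation: condition z2 on z1 to decouple the pair.
xi = np.column_stack([z[:, 0],
                      (z[:, 1] - rho * z[:, 0]) / np.sqrt(1.0 - rho**2)])

# 4) Inverse map (generative direction): independent Gaussians -> dependent x.
xi_new = rng.standard_normal((n, 2))
z_new = np.column_stack([xi_new[:, 0],
                         rho * xi_new[:, 0] + np.sqrt(1.0 - rho**2) * xi_new[:, 1]])
u_new = norm_cdf(z_new)
x_new = np.column_stack([np.quantile(x[:, j], u_new[:, j]) for j in range(2)])

print("estimated rho:", round(float(rho), 3))
print("rank correlation, data vs generated:",
      round(float(rank_corr(x)), 3), round(float(rank_corr(x_new)), 3))
```

The decoupled samples `xi` are what a Gaussian-based PCE or basis adaptation would take as input, and step 4 is the route by which new dependent samples can be drawn. A vine copula replaces the single parameter `rho` with a cascade of pair copulas, but the forward and inverse roles of the Rosenblatt map remain the same.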

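Source journal

Computer Methods in Applied Mechanics and Engineering (Engineering and Technology, Engineering: Multidisciplinary)

CiteScore: 12.70
Self-citation rate: 15.30%
Articles published: 719
Review time: 44 days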

About the journal: Computer Methods in Applied Mechanics and Engineering stands as a cornerstone in the realm of computational science and engineering. With a history spanning over five decades, the journal has been a key platform for disseminating papers on advanced mathematical modeling and numerical solutions. Interdisciplinary in nature, these contributions encompass mechanics, mathematics, computer science, and various scientific disciplines. The journal welcomes a broad range of computational methods addressing the simulation, analysis, and design of complex physical problems, making it a vital resource for researchers in the field.