Data-driven projection pursuit adaptation of polynomial chaos expansions for dependent high-dimensional parameters

IF 6.9, Zone 1 (Engineering and Technology), Q1 ENGINEERING, MULTIDISCIPLINARY
Xiaoshu Zeng, Roger Ghanem
Journal: Computer Methods in Applied Mechanics and Engineering
Publication type: Journal Article
Publication date: 2024-11-05
DOI: 10.1016/j.cma.2024.117505
URL: https://www.sciencedirect.com/science/article/pii/S004578252400759X
Citation count: 0

Abstract

Uncertainty quantification (UQ) and inference involving a large number of parameters are valuable tools for problems associated with heterogeneous and non-stationary behaviors. The difficulty of these problems is exacerbated when the parameters are statistically dependent, which requires statistical characterization over joint measures. Probabilistic modeling methodologies are effective tools for UQ and inference. Among these, polynomial chaos expansions (PCE), when adapted to low-dimensional quantities of interest (QoI), provide efficient yet accurate approximations of these QoI in terms of an adapted orthogonal basis. These adaptation techniques have been cast as projection pursuits in Gaussian Hilbert space, in what Xiaoshu Zeng and Roger Ghanem (2023) refer to as projection pursuit adaptation (PPA). The PPA method efficiently identifies an optimal low-dimensional space for representing the QoI and simultaneously evaluates an optimal PCE within that space. The quality of this approximation depends on the size of the training dataset, which is typically a function of the adapted reduced dimension. The complexity of the problem is thus mediated by the complexity of the low-dimensional quantity of interest and not by the complexity of the high-dimensional parameter space.
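To make the idea of a PCE in an adapted one-dimensional space concrete, the following is a minimal sketch and not the PPA algorithm itself. It assumes the projection direction is already known (PPA would search for it), uses probabilists' Hermite polynomials in the adapted variable eta = a . xi, and estimates coefficients by plain sample averages over pre-evaluated (parameter, QoI) pairs. The names `hermite`, `fit_adapted_pce`, `evaluate_pce`, and `toy_qoi` are hypothetical and chosen for illustration only.

```python
import math
import random


def hermite(order, x):
    """Return [He_0(x), ..., He_order(x)] for probabilists' Hermite polynomials."""
    values = [1.0, x]
    for k in range(1, order):
        # Three-term recurrence: He_{k+1}(x) = x He_k(x) - k He_{k-1}(x)
        values.append(x * values[k] - k * values[k - 1])
    return values[: order + 1]


def fit_adapted_pce(samples, qoi_values, direction, order):
    """Estimate c_k = E[q(xi) He_k(eta)] / k! by sample averages, eta = a . xi.

    Assumes the rows of `samples` are draws of independent standard Gaussians
    and that `direction` (an assumption here, not learned) spans the adapted space.
    """
    norm = math.sqrt(sum(a * a for a in direction))
    unit = [a / norm for a in direction]
    coeffs = [0.0] * (order + 1)
    for xi, q in zip(samples, qoi_values):
        eta = sum(a * x for a, x in zip(unit, xi))
        for k, he in enumerate(hermite(order, eta)):
            coeffs[k] += q * he / math.factorial(k)
    return [c / len(samples) for c in coeffs], unit


def evaluate_pce(coeffs, unit, xi):
    """Evaluate the one-dimensional adapted PCE at a full parameter vector xi."""
    eta = sum(a * x for a, x in zip(unit, xi))
    return sum(c * he for c, he in zip(coeffs, hermite(len(coeffs) - 1, eta)))


# Toy problem (hypothetical): a 10-dimensional input whose QoI depends on a
# single linear combination of the inputs.
random.seed(0)
dim = 10
direction = [1.0] * dim


def toy_qoi(xi):
    s = sum(xi) / math.sqrt(dim)
    return s * s


samples = [[random.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(20000)]
qoi_values = [toy_qoi(xi) for xi in samples]
coeffs, unit = fit_adapted_pce(samples, qoi_values, direction, order=3)
```

In this toy case the QoI equals He_0(eta) + He_2(eta), so the estimated coefficients should be close to 1, 0, 1, 0 up to sampling noise. The number of coefficients depends only on the polynomial order in the adapted variable and not on the ambient dimension, which is the point the paragraph above makes about complexity.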
In this paper, our objective is to tackle the challenge of dependent parameters while constructing the PPA, using a generative data-driven framework that requires a fixed number of pre-evaluated (parameter, QoI) pairs. While PCE approaches dealing with dependent input parameters have already been introduced by Christian Soize and Roger Ghanem (2004), their coupling with basis adaptation remains an outstanding task, without which they remain plagued by the curse of dimensionality. For a modest number of parameters, a mapping such as the Rosenblatt transformation can be employed to decouple the dependent variables. This strategy requires access to the joint distribution of the random variables, which is usually unavailable and whose estimation requires significantly more data than is typically at hand. To overcome these limitations, we propose leveraging multivariate Regular Vine (R-vine) copulas to encapsulate the dependency structure within parameters, manifested as a joint cumulative distribution function (CDF). The Rosenblatt transformation can then be applied to decouple the dependent input data, mapping them to samples from independent Gaussian variables. Conversely, we can generate dependent samples from independent Gaussian variables while maintaining the learned dependencies. This generative capability ensures that the reconstructed dependency structure is faithfully preserved in the generated samples. Endowed with the ability to diagonalize measures on product spaces, the R-vine copula blends seamlessly with the PPA method, resulting in a unified procedure for constructing optimally reduced PCE models tailored for high-dimensional problems with dependent parameter spaces. The proposed methodology attains remarkable accuracy for both UQ and inference. In the latter, the constructed PCE model serves as a generative and convergent surrogate model for machine learning regression. The efficiency of the proposed methodology is validated through two distinct applications: water flow through a borehole and structural dynamics.
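The two-way mapping described above (dependent data to independent Gaussians, and back) can be illustrated in two dimensions. The sketch below is an assumption-laden stand-in, not the paper's procedure: it replaces the learned R-vine copula with a bivariate Gaussian copula of known correlation `rho`, and it assumes both marginals are unit-rate exponentials instead of being estimated from data. The function names `forward`, `inverse`, and `generate_dependent` are hypothetical.

```python
import math
import random
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution
EPS = 1e-12         # keeps probabilities strictly inside (0, 1)


def _clip(u):
    return min(max(u, EPS), 1.0 - EPS)


def exp_cdf(x):
    """CDF of the assumed unit-rate exponential marginal."""
    return 1.0 - math.exp(-x)


def exp_ppf(u):
    """Inverse CDF of the assumed unit-rate exponential marginal."""
    return -math.log(1.0 - u)


def forward(x1, x2, rho):
    """Rosenblatt-type map: dependent (x1, x2) -> independent standard normals.

    Assumes a Gaussian copula with correlation rho (a stand-in for an R-vine).
    """
    z1 = STD.inv_cdf(_clip(exp_cdf(x1)))
    z2 = STD.inv_cdf(_clip(exp_cdf(x2)))
    # Condition the second Gaussian score on the first to remove the dependence.
    xi2 = (z2 - rho * z1) / math.sqrt(1.0 - rho * rho)
    return z1, xi2


def inverse(xi1, xi2, rho):
    """Inverse map: independent standard normals -> dependent (x1, x2)."""
    z2 = rho * xi1 + math.sqrt(1.0 - rho * rho) * xi2
    x1 = exp_ppf(_clip(STD.cdf(xi1)))
    x2 = exp_ppf(_clip(STD.cdf(z2)))
    return x1, x2


def generate_dependent(n, rho, rng):
    """Generate n dependent pairs by pushing independent Gaussians through `inverse`."""
    return [inverse(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0), rho) for _ in range(n)]


rng = random.Random(0)
dependent_pairs = generate_dependent(1000, rho=0.7, rng=rng)
decoupled_pairs = [forward(x1, x2, 0.7) for x1, x2 in dependent_pairs]
```

With a vine copula the conditioning step is applied recursively through the pair-copula construction rather than through a single correlation coefficient, but the pattern is the same: the forward map feeds independent Gaussian inputs to the PCE, and the inverse map generates new dependent samples.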
Source journal
CiteScore: 12.70
Self-citation rate: 15.30%
Articles published: 719
Review time: 44 days
Journal description: Computer Methods in Applied Mechanics and Engineering stands as a cornerstone in the realm of computational science and engineering. With a history spanning over five decades, the journal has been a key platform for disseminating papers on advanced mathematical modeling and numerical solutions. Interdisciplinary in nature, these contributions encompass mechanics, mathematics, computer science, and various scientific disciplines. The journal welcomes a broad range of computational methods addressing the simulation, analysis, and design of complex physical problems, making it a vital resource for researchers in the field.