Novel approaches for hyper-parameter tuning of physics-informed Gaussian processes: application to parametric PDEs
Masoud Ezati, Mohsen Esmaeilbeigi, Ahmad Kamandi
Engineering with Computers (Journal Article), published 2024-04-08
DOI: 10.1007/s00366-024-01970-8 (https://doi.org/10.1007/s00366-024-01970-8)
Impact factor: 8.7; JCR: Q1 (Mathematics); Region 2 (Engineering Technology)
Citations: 0
Abstract
Physics-informed machine learning (PIML) methods are effective and highly flexible tools for solving inverse problems and operational equations. Among these methods, the physics-informed learning model built upon Gaussian processes (PIGP) has a special place because it provides the posterior probability distribution of its predictions in the context of Bayesian inference. In this method, the training phase, which determines the optimal hyper-parameters, amounts to optimizing a non-convex function called the likelihood function. Because the gradient is available in explicit form, conjugate gradient (CG) optimization algorithms are recommended. In addition, because each evaluation of the likelihood function requires computing the determinant and inverse of the covariance matrix, the CG methods should be used in a way that completes the optimization in as few evaluations as possible. Previous studies have considered only a special form of the CG method, which is naturally not highly efficient. This paper studies the efficiency of CG methods for optimizing the likelihood function in PIGP. Numerical simulations show that the initial step length and the search direction in CG methods significantly affect the number of likelihood evaluations and, consequently, the efficiency of PIGP. In addition, based on the specific characteristics of the objective function in this problem, two modifications to the traditional CG methods are proposed: normalizing the initial step length to avoid getting stuck at ill-conditioned points, and improving the search direction with an angle condition to guarantee global convergence.
Numerical simulations of seven improved CG methods, with four angles in the angle condition and three initial step lengths, show that the proposed modifications significantly reduce the number of iterations and the number of evaluations across the different types of CG methods. This significantly increases the efficiency of the PIGP method; in particular, the improved algorithms perform well where the traditional CG algorithms fail in the optimization process. Finally, to make the studies carried out in this paper applicable to other parametric equations, the compiled package containing the methods used in this paper is attached.
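To make the ingredients named in the abstract concrete, the following is a minimal sketch and not the authors' package. It assumes a plain one-dimensional Gaussian process with a squared-exponential kernel (no physics-informed terms), a Polak-Ribière+ update, and a simple backtracking line search. The names `nlml_and_grad`, `cg_minimize`, and `angle_cos`, the reading of "normalized initial step" as 1/||d||, and the restart to steepest descent when the angle condition fails are all illustrative assumptions.

```python
import numpy as np

def nlml_and_grad(theta, x, y, noise=1e-2):
    """Negative log marginal likelihood and its explicit gradient.

    theta = (log length-scale, log signal std); the kernel is an assumption.
    """
    log_ell, log_sf = theta
    n = len(x)
    d2 = (x[:, None] - x[None, :]) ** 2
    Kf = np.exp(2 * log_sf) * np.exp(-0.5 * d2 / np.exp(2 * log_ell))
    K = Kf + noise * np.eye(n)
    L = np.linalg.cholesky(K)  # gives both log-determinant and solves
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))  # K^{-1} y
    nlml = 0.5 * y @ alpha + np.sum(np.log(np.diag(L))) + 0.5 * n * np.log(2 * np.pi)
    Kinv = np.linalg.solve(L.T, np.linalg.solve(L, np.eye(n)))
    W = Kinv - np.outer(alpha, alpha)  # d(nlml)/d(theta_j) = 0.5 * tr(W dK/d(theta_j))
    dK_dlog_ell = Kf * d2 / np.exp(2 * log_ell)
    dK_dlog_sf = 2 * Kf
    grad = np.array([0.5 * np.sum(W * dK_dlog_ell), 0.5 * np.sum(W * dK_dlog_sf)])
    return nlml, grad

def cg_minimize(f, theta0, angle_cos=1e-3, max_iter=200, tol=1e-6):
    """Nonlinear CG (PR+) with an angle-condition safeguard (hypothetical helper)."""
    theta = np.asarray(theta0, dtype=float)
    fval, g = f(theta)
    n_evals = 1
    d = -g
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        # Angle condition: the angle between d and -g must stay bounded away
        # from 90 degrees; otherwise fall back to steepest descent.
        if -(g @ d) < angle_cos * np.linalg.norm(g) * np.linalg.norm(d):
            d = -g
        step = 1.0 / np.linalg.norm(d)  # normalized initial step: first trial moves unit distance
        slope = g @ d
        while True:  # backtracking (Armijo) line search
            trial = theta + step * d
            try:
                f_new, g_new = f(trial)
                ok = np.isfinite(f_new)
            except np.linalg.LinAlgError:  # covariance not positive definite at trial point
                ok = False
            n_evals += 1
            if ok and f_new <= fval + 1e-4 * step * slope:
                break
            step *= 0.5
            if step < 1e-12:
                return theta, fval, n_evals
        beta = max(0.0, g_new @ (g_new - g) / (g @ g))  # Polak-Ribiere+
        d = -g_new + beta * d
        theta, fval, g = trial, f_new, g_new
    return theta, fval, n_evals

# Toy usage on synthetic data (not taken from the paper).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 20)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(20)
objective = lambda th: nlml_and_grad(th, x, y)
f_start, _ = objective(np.zeros(2))
theta_opt, f_opt, n_evals = cg_minimize(objective, np.zeros(2))
print(f"NLML {f_start:.3f} -> {f_opt:.3f} in {n_evals} evaluations")
```

The sketch counts objective evaluations because each one requires a Cholesky factorization of the covariance matrix, which is the cost the abstract seeks to minimize. Varying `angle_cos` and the initial `step` rule is one way to see how the two design choices change that count.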
About the journal:
Engineering with Computers is an international journal dedicated to simulation-based engineering. It features original papers and comprehensive reviews on technologies supporting simulation-based engineering, along with demonstrations of operational simulation-based engineering systems. The journal covers various technical areas such as adaptive simulation techniques, engineering databases, CAD geometry integration, mesh generation, parallel simulation methods, simulation frameworks, user interface technologies, and visualization techniques. It also encompasses a wide range of application areas where engineering technologies are applied, spanning from automotive industry applications to medical device design.