求解偏微分方程的物理信息神经网络中可学习的激活函数

IF 7.2 2区物理与天体物理 Q1 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS

Computer Physics Communications Pub Date : 2025-07-15 DOI:10.1016/j.cpc.2025.109753

Afrah Farea, Mustafa Serdar Celebi

{"title":"求解偏微分方程的物理信息神经网络中可学习的激活函数","authors":"Afrah Farea, Mustafa Serdar Celebi","doi":"10.1016/j.cpc.2025.109753","DOIUrl":null,"url":null,"abstract":"<div><div>Physics-Informed Neural Networks (PINNs) have emerged as a promising approach for solving Partial Differential Equations (PDEs). However, they face challenges related to spectral bias (the tendency to learn low-frequency components while struggling with high-frequency features) and unstable convergence dynamics (mainly stemming from the multi-objective nature of the PINN loss function). These limitations impact their accuracy for solving problems involving rapid oscillations, sharp gradients, and complex boundary behaviors. We systematically investigate learnable activation functions as a solution to these challenges, comparing Multilayer Perceptrons (MLPs) using fixed and learnable activation functions against Kolmogorov-Arnold Networks (KANs) that employ learnable basis functions. Our evaluation spans diverse PDE types, including linear and non-linear wave problems, mixed-physics systems, and fluid dynamics. Using empirical Neural Tangent Kernel (NTK) analysis and Hessian eigenvalue decomposition, we assess spectral bias and convergence stability of the models. Our results reveal a trade-off between expressivity and training convergence stability. While learnable activation functions work well in simpler architectures, they encounter scalability issues in complex networks due to the higher functional dimensionality. Counterintuitively, we find that low spectral bias alone does not guarantee better accuracy, as functions with broader NTK eigenvalue spectra may exhibit convergence instability. We demonstrate that activation function selection remains inherently problem-specific, with different bases showing distinct advantages for particular PDE characteristics. We believe these insights will help in the design of more robust neural PDE solvers.</div></div>","PeriodicalId":285,"journal":{"name":"Computer Physics Communications","volume":"315 ","pages":"Article 109753"},"PeriodicalIF":7.2000,"publicationDate":"2025-07-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Learnable activation functions in physics-informed neural networks for solving partial differential equations\",\"authors\":\"Afrah Farea, Mustafa Serdar Celebi\",\"doi\":\"10.1016/j.cpc.2025.109753\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Physics-Informed Neural Networks (PINNs) have emerged as a promising approach for solving Partial Differential Equations (PDEs). However, they face challenges related to spectral bias (the tendency to learn low-frequency components while struggling with high-frequency features) and unstable convergence dynamics (mainly stemming from the multi-objective nature of the PINN loss function). These limitations impact their accuracy for solving problems involving rapid oscillations, sharp gradients, and complex boundary behaviors. We systematically investigate learnable activation functions as a solution to these challenges, comparing Multilayer Perceptrons (MLPs) using fixed and learnable activation functions against Kolmogorov-Arnold Networks (KANs) that employ learnable basis functions. Our evaluation spans diverse PDE types, including linear and non-linear wave problems, mixed-physics systems, and fluid dynamics. Using empirical Neural Tangent Kernel (NTK) analysis and Hessian eigenvalue decomposition, we assess spectral bias and convergence stability of the models. Our results reveal a trade-off between expressivity and training convergence stability. While learnable activation functions work well in simpler architectures, they encounter scalability issues in complex networks due to the higher functional dimensionality. Counterintuitively, we find that low spectral bias alone does not guarantee better accuracy, as functions with broader NTK eigenvalue spectra may exhibit convergence instability. We demonstrate that activation function selection remains inherently problem-specific, with different bases showing distinct advantages for particular PDE characteristics. We believe these insights will help in the design of more robust neural PDE solvers.</div></div>\",\"PeriodicalId\":285,\"journal\":{\"name\":\"Computer Physics Communications\",\"volume\":\"315 \",\"pages\":\"Article 109753\"},\"PeriodicalIF\":7.2000,\"publicationDate\":\"2025-07-15\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Computer Physics Communications\",\"FirstCategoryId\":\"101\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0010465525002553\",\"RegionNum\":2,\"RegionCategory\":\"物理与天体物理\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computer Physics Communications","FirstCategoryId":"101","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0010465525002553","RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS","Score":null,"Total":0}

引用次数: 0

摘要

物理信息神经网络（pinn）已经成为求解偏微分方程（PDEs）的一种很有前途的方法。然而，他们面临着与频谱偏差（在努力学习高频特征的同时倾向于学习低频成分）和不稳定的收敛动力学（主要源于PINN损失函数的多目标性质）相关的挑战。这些限制影响了它们在解决涉及快速振荡、急剧梯度和复杂边界行为的问题时的准确性。我们系统地研究了可学习激活函数作为这些挑战的解决方案，将使用固定和可学习激活函数的多层感知器（mlp）与使用可学习基函数的Kolmogorov-Arnold网络（KANs）进行了比较。我们的评估涵盖了多种PDE类型，包括线性和非线性波动问题、混合物理系统和流体动力学。利用经验神经切线核（NTK）分析和Hessian特征值分解，我们评估了模型的谱偏差和收敛稳定性。我们的结果揭示了表达性和训练收敛稳定性之间的权衡。虽然可学习激活函数在较简单的体系结构中工作得很好，但由于功能维度较高，它们在复杂网络中会遇到可伸缩性问题。与直觉相反，我们发现仅靠低谱偏并不能保证更好的精度，因为具有更宽的NTK特征值谱的函数可能表现出收敛不稳定性。我们证明了激活函数的选择仍然具有固有的问题特异性，对于特定的PDE特征，不同的碱基显示出不同的优势。我们相信这些见解将有助于设计更健壮的神经PDE求解器。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Learnable activation functions in physics-informed neural networks for solving partial differential equations

Physics-Informed Neural Networks (PINNs) have emerged as a promising approach for solving Partial Differential Equations (PDEs). However, they face challenges related to spectral bias (the tendency to learn low-frequency components while struggling with high-frequency features) and unstable convergence dynamics (mainly stemming from the multi-objective nature of the PINN loss function). These limitations impact their accuracy for solving problems involving rapid oscillations, sharp gradients, and complex boundary behaviors. We systematically investigate learnable activation functions as a solution to these challenges, comparing Multilayer Perceptrons (MLPs) using fixed and learnable activation functions against Kolmogorov-Arnold Networks (KANs) that employ learnable basis functions. Our evaluation spans diverse PDE types, including linear and non-linear wave problems, mixed-physics systems, and fluid dynamics. Using empirical Neural Tangent Kernel (NTK) analysis and Hessian eigenvalue decomposition, we assess spectral bias and convergence stability of the models. Our results reveal a trade-off between expressivity and training convergence stability. While learnable activation functions work well in simpler architectures, they encounter scalability issues in complex networks due to the higher functional dimensionality. Counterintuitively, we find that low spectral bias alone does not guarantee better accuracy, as functions with broader NTK eigenvalue spectra may exhibit convergence instability. We demonstrate that activation function selection remains inherently problem-specific, with different bases showing distinct advantages for particular PDE characteristics. We believe these insights will help in the design of more robust neural PDE solvers.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Computer Physics Communications 物理-计算机：跨学科应用

CiteScore

12.10

自引率

3.20%

发文量

287

审稿时长

5.3 months

期刊介绍： The focus of CPC is on contemporary computational methods and techniques and their implementation, the effectiveness of which will normally be evidenced by the author(s) within the context of a substantive problem in physics. Within this setting CPC publishes two types of paper. Computer Programs in Physics (CPiP) These papers describe significant computer programs to be archived in the CPC Program Library which is held in the Mendeley Data repository. The submitted software must be covered by an approved open source licence. Papers and associated computer programs that address a problem of contemporary interest in physics that cannot be solved by current software are particularly encouraged. Computational Physics Papers (CP) These are research papers in, but are not limited to, the following themes across computational physics and related disciplines. mathematical and numerical methods and algorithms; computational models including those associated with the design, control and analysis of experiments; and algebraic computation. Each will normally include software implementation and performance details. The software implementation should, ideally, be available via GitHub, Zenodo or an institutional repository.In addition, research papers on the impact of advanced computer architecture and special purpose computers on computing in the physical sciences and software topics related to, and of importance in, the physical sciences may be considered.