Connections Between Numerical Algorithms for PDEs and Neural Networks.

Impact factor: 1.5; Region 4 (Mathematics); JCR Q4: Computer Science, Artificial Intelligence

Journal of Mathematical Imaging and Vision. Pub Date: 2023-01-01. Epub Date: 2022-06-24. DOI: 10.1007/s10851-022-01106-x

Tobias Alt, Karl Schrader, Matthias Augustin, Pascal Peter, Joachim Weickert

Volume 65, issue 1, pages 185-208. Full-text PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9883332/pdf/


Abstract

We investigate numerous structural connections between numerical algorithms for partial differential equations (PDEs) and neural architectures. Our goal is to transfer the rich set of mathematical foundations from the world of PDEs to neural networks. Besides structural insights, we provide concrete examples and experimental evaluations of the resulting architectures. Using the example of generalised nonlinear diffusion in 1D, we consider explicit schemes, acceleration strategies thereof, implicit schemes, and multigrid approaches. We connect these concepts to residual networks, recurrent neural networks, and U-net architectures. Our findings inspire a symmetric residual network design with provable stability guarantees and justify the effectiveness of skip connections in neural networks from a numerical perspective. Moreover, we present U-net architectures that implement multigrid techniques for learning efficient solutions of partial differential equation models, and motivate uncommon design choices such as trainable nonmonotone activation functions. Experimental evaluations show that the proposed architectures save half of the trainable parameters and can thus outperform standard ones with the same model complexity. Our considerations serve as a basis for explaining the success of popular neural architectures and provide a blueprint for developing new mathematically well-founded neural building blocks.
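The link between explicit diffusion schemes and residual networks mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and every concrete choice in it is an assumption made for illustration: the forward-difference operator K with grid size 1, the Charbonnier diffusivity inside `flux`, the step size `tau = 0.25`, and all function names. One explicit step is written as u - tau * K^T Phi(K u): an inner convolution, a nonlinearity, and an outer convolution that reuses the inner weights in transposed form, plus a skip connection.

```python
import math


def forward_diff(u):
    """Inner convolution K: forward differences with grid size 1 (assumed)."""
    return [u[i + 1] - u[i] for i in range(len(u) - 1)]


def forward_diff_T(v):
    """Outer convolution K^T: the exact transpose of forward_diff."""
    out = [0.0] * (len(v) + 1)
    for i, vi in enumerate(v):
        out[i] -= vi
        out[i + 1] += vi
    return out


def flux(s, lam=1.0):
    """Activation Phi(s) = g(s^2) * s with the Charbonnier diffusivity
    g(s^2) = 1 / sqrt(1 + s^2 / lam^2), chosen here only as an example."""
    return s / math.sqrt(1.0 + (s / lam) ** 2)


def diffusion_block(u, tau=0.25, lam=1.0):
    """One explicit diffusion step, read as a symmetric residual block:
    skip connection u minus tau * K^T Phi(K u)."""
    inner = [flux(s, lam) for s in forward_diff(u)]
    outer = forward_diff_T(inner)
    return [ui - tau * oi for ui, oi in zip(u, outer)]


def diffuse(u, steps=50, tau=0.25, lam=1.0):
    """Stack `steps` identical blocks, i.e. run `steps` explicit time steps."""
    for _ in range(steps):
        u = diffusion_block(u, tau, lam)
    return u


# Usage: smooth a step edge while keeping its average value.
signal = [0.0] * 8 + [1.0] * 8
smoothed = diffuse(signal)
```

Because the outer operator is tied to the inner one, such a block stores only one set of filter weights. For the choices assumed here (diffusivity bounded by 1, grid size 1), any step size of at most 0.5 makes each output value a convex combination of neighbouring inputs, so the signal can neither overshoot its initial range nor grow in Euclidean norm.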



Source journal

Journal of Mathematical Imaging and Vision (Engineering and Technology: Computer Science, Artificial Intelligence)

CiteScore: 4.30

Self-citation rate: 5.00%

Review time: 3.3 months

About the journal: The Journal of Mathematical Imaging and Vision is a technical journal publishing important new developments in mathematical imaging. The journal publishes research articles, invited papers, and expository articles. Current developments in new image processing hardware, the advent of multisensor data fusion, and rapid advances in vision research have led to an explosive growth in the interdisciplinary field of imaging science. This growth has resulted in the development of highly sophisticated mathematical models and theories. The journal emphasizes the role of mathematics as a rigorous basis for imaging science. This provides a sound alternative to present journals in this area. Contributions are judged on the basis of mathematical content. Articles may be physically speculative but need to be mathematically sound. Emphasis is placed on innovative or established mathematical techniques applied to vision and imaging problems in a novel way, as well as new developments and problems in mathematics arising from these applications. The scope of the journal includes: computational models of vision; imaging algebra and mathematical morphology; mathematical methods in reconstruction, compactification, and coding; filter theory; probabilistic, statistical, geometric, topological, and fractal techniques and models in imaging science; inverse optics; wave theory. Specific application areas of interest include, but are not limited to: all aspects of image formation and representation; medical, biological, industrial, geophysical, astronomical and military imaging; image analysis and image understanding; parallel and distributed computing; computer vision architecture design.