PottsMGNet: A Mathematical Explanation of Encoder-Decoder Based Neural Networks

Impact factor: 2.1 · CAS Region 3 (Mathematics) · JCR Q3 · COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

SIAM Journal on Imaging Sciences · Publication date: 2024-03-07 · DOI: 10.1137/23m1586355

Xue-Cheng Tai, Hao Liu, Raymond Chan


Citations: 0




SIAM Journal on Imaging Sciences, Volume 17, Issue 1, Page 540-594, March 2024.
Abstract. For problems in image processing and many other fields, a large class of effective neural networks has encoder-decoder-based architectures. Although these networks have shown impressive performance, mathematical explanations of their architectures are still underdeveloped. In this paper, we study the encoder-decoder-based network architecture from the algorithmic perspective and provide a mathematical explanation. We use the two-phase Potts model for image segmentation as an example for our explanations. We associate the segmentation problem with a control problem in the continuous setting. Then, the continuous control model is time discretized by an operator-splitting scheme, the PottsMGNet, and space discretized by the multigrid method. We show that the resulting discrete PottsMGNet is equivalent to an encoder-decoder-based network. With minor modifications, it is shown that a number of the popular encoder-decoder-based neural networks are just instances of the proposed PottsMGNet. By incorporating the soft-threshold-dynamics into the PottsMGNet as a regularizer, the PottsMGNet has been shown to be robust to network parameters such as network width and depth and has achieved remarkable performance on datasets with very large noise. In nearly all our experiments, the new network performs as well as or better than existing networks for image segmentation in terms of accuracy and Dice score.
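To give a feel for the soft-threshold-dynamics regularizer mentioned in the abstract, here is a minimal NumPy sketch. It is not the authors' PottsMGNet and is not taken from the paper. It assumes one common relaxed form of the two-phase Potts energy, E(u) = <u, f> + eps * <u ln u + (1 - u) ln(1 - u)> + lam * <u, G * (1 - u)>, whose linearized minimization gives the sigmoid update used below. The names `smooth`, `soft_threshold_dynamics`, `lam`, `eps`, and `iters` are hypothetical, and the 3x3 mean filter is only a stand-in for the smoothing kernel G.

```python
import numpy as np


def smooth(u):
    """3x3 mean filter with edge replication (illustrative stand-in for G)."""
    padded = np.pad(u, 1, mode="edge")
    h, w = u.shape
    # Sum the nine shifted copies of the image and average them.
    return sum(padded[i:i + h, j:j + w] for i in range(3) for j in range(3)) / 9.0


def soft_threshold_dynamics(f, lam=1.0, eps=0.1, iters=20):
    """Iterate u <- sigmoid(-(f + lam * G*(1 - 2u)) / eps), starting from u = 0.5.

    f is a region-force term: negative where a pixel should belong to phase 1.
    The result is a soft segmentation with values in [0, 1].
    """
    u = np.full(f.shape, 0.5)
    for _ in range(iters):
        z = -(f + lam * smooth(1.0 - 2.0 * u)) / eps
        # Clip the argument so np.exp cannot overflow.
        u = 1.0 / (1.0 + np.exp(-np.clip(z, -50.0, 50.0)))
    return u


# Usage: segment a noisy two-region image with assumed phase means 0 and 1.
rng = np.random.default_rng(0)
truth = np.zeros((32, 32))
truth[:, 16:] = 1.0
image = truth + 0.3 * rng.standard_normal(truth.shape)
f = (image - 1.0) ** 2 - (image - 0.0) ** 2  # negative where the pixel is closer to 1
segmentation = soft_threshold_dynamics(f) > 0.5
```

The sigmoid plays the role of a soft threshold, and the smoothing term penalizes isolated pixels that disagree with their neighbors, which is why this kind of step tends to help on noisy inputs. In the paper the corresponding operators are learned and arranged on multiple grids; this sketch fixes them by hand.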


Source journal

SIAM Journal on Imaging Sciences (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; COMPUTER SCIENCE, SOFTWARE ENGINEERING)

CiteScore: 3.80

Self-citation rate: 4.80%

Articles published:

Review time: >12 weeks

About the journal: SIAM Journal on Imaging Sciences (SIIMS) covers all areas of imaging sciences, broadly interpreted. It includes image formation, image processing, image analysis, image interpretation and understanding, imaging-related machine learning, and inverse problems in imaging; leading to applications to diverse areas in science, medicine, engineering, and other fields. The journal’s scope is meant to be broad enough to include areas now organized under the terms image processing, image analysis, computer graphics, computer vision, visual machine learning, and visualization. Formal approaches, at the level of mathematics and/or computations, as well as state-of-the-art practical results, are expected from manuscripts published in SIIMS. SIIMS is mathematically and computationally based, and offers a unique forum to highlight the commonality of methodology, models, and algorithms among diverse application areas of imaging sciences. SIIMS provides a broad authoritative source for fundamental results in imaging sciences, with a unique combination of mathematics and applications. SIIMS covers a broad range of areas, including but not limited to image formation, image processing, image analysis, computer graphics, computer vision, visualization, image understanding, pattern analysis, machine intelligence, remote sensing, geoscience, signal processing, medical and biomedical imaging, and seismic imaging. The fundamental mathematical theories addressing imaging problems covered by SIIMS include, but are not limited to, harmonic analysis, partial differential equations, differential geometry, numerical analysis, information theory, learning, optimization, statistics, and probability. Research papers that innovate both in the fundamentals and in the applications are especially welcome. SIIMS focuses on conceptually new ideas, methods, and fundamentals as applied to all aspects of imaging sciences.