PottsMGNet: A Mathematical Explanation of Encoder-Decoder Based Neural Networks

Impact factor 2.1 · Zone 3 (Mathematics) · JCR Q3, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Xue-Cheng Tai, Hao Liu, Raymond Chan
DOI: 10.1137/23m1586355 (https://doi.org/10.1137/23m1586355)
Publication date: 2024-03-07
Citations: 0

PottsMGNet: A Mathematical Explanation of Encoder-Decoder Based Neural Networks
SIAM Journal on Imaging Sciences, Volume 17, Issue 1, Page 540-594, March 2024.
Abstract. For problems in image processing and many other fields, a large class of effective neural networks has encoder-decoder-based architectures. Although these networks have shown impressive performance, mathematical explanations of their architectures are still underdeveloped. In this paper, we study the encoder-decoder-based network architecture from the algorithmic perspective and provide a mathematical explanation. We use the two-phase Potts model for image segmentation as an example for our explanations. We associate the segmentation problem with a control problem in the continuous setting. Then, the continuous control model is discretized in time by an operator-splitting scheme, the PottsMGNet, and in space by the multigrid method. We show that the resulting discrete PottsMGNet is equivalent to an encoder-decoder-based network. With minor modifications, it is shown that a number of popular encoder-decoder-based neural networks are just instances of the proposed PottsMGNet. With soft-threshold dynamics incorporated as a regularizer, the PottsMGNet is shown to be robust to network parameters such as width and depth, and it achieves remarkable performance on datasets with very large noise. In nearly all our experiments, the new network performs at least as well as existing image segmentation networks in accuracy and Dice score.
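The abstract reports results in terms of accuracy and Dice score. As a reminder of what these two metrics measure for a two-phase (binary) segmentation, here is a minimal sketch in plain Python. It is not taken from the paper: the function names `accuracy` and `dice_score`, the flattened 0/1 mask representation, and the convention for two empty masks are all assumptions made for illustration.

```python
from typing import Sequence


def _check(pred: Sequence[int], truth: Sequence[int]) -> None:
    """Both masks must be non-empty and cover the same pixels."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    if len(pred) == 0:
        raise ValueError("masks must be non-empty")


def accuracy(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Fraction of pixels whose predicted label equals the reference label."""
    _check(pred, truth)
    return sum(1 for p, t in zip(pred, truth) if p == t) / len(pred)


def dice_score(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice = 2|A ∩ B| / (|A| + |B|) for binary masks A (pred) and B (truth)."""
    _check(pred, truth)
    intersection = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    total = sum(pred) + sum(truth)
    # Assumed convention: two all-background masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Hypothetical flattened masks (1 = foreground, 0 = background).
pred = [1, 1, 0, 0, 1, 0, 0, 0]
truth = [1, 0, 0, 0, 1, 1, 0, 0]
print("accuracy:", accuracy(pred, truth))
print("dice:", dice_score(pred, truth))
```

Accuracy counts background agreement as well, so it can look high when the foreground is small. Dice only rewards overlap of the foreground region, which is one reason segmentation results are often reported with both.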
Source journal
SIAM Journal on Imaging Sciences (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; COMPUTER SCIENCE, SOFTWARE ENGINEERING)
CiteScore: 3.80
Self-citation rate: 4.80%
Articles published: 58
Review time: >12 weeks
About the journal: SIAM Journal on Imaging Sciences (SIIMS) covers all areas of imaging sciences, broadly interpreted. It includes image formation, image processing, image analysis, image interpretation and understanding, imaging-related machine learning, and inverse problems in imaging, leading to applications in diverse areas of science, medicine, engineering, and other fields. The journal's scope is meant to be broad enough to include areas now organized under the terms image processing, image analysis, computer graphics, computer vision, visual machine learning, and visualization. Formal approaches, at the level of mathematics and/or computations, as well as state-of-the-art practical results, are expected from manuscripts published in SIIMS. SIIMS is mathematically and computationally based, and offers a unique forum to highlight the commonality of methodology, models, and algorithms among diverse application areas of imaging sciences. SIIMS provides a broad authoritative source for fundamental results in imaging sciences, with a unique combination of mathematics and applications. SIIMS covers a broad range of areas, including but not limited to image formation, image processing, image analysis, computer graphics, computer vision, visualization, image understanding, pattern analysis, machine intelligence, remote sensing, geoscience, signal processing, medical and biomedical imaging, and seismic imaging. The fundamental mathematical theories addressing imaging problems covered by SIIMS include, but are not limited to, harmonic analysis, partial differential equations, differential geometry, numerical analysis, information theory, learning, optimization, statistics, and probability. Research papers that innovate both in the fundamentals and in the applications are especially welcome. SIIMS focuses on conceptually new ideas, methods, and fundamentals as applied to all aspects of imaging sciences.