On spectral bias reduction of multi-scale neural networks for regression problems

IF 6; Zone 1 Computer Science; Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Neural Networks, Pub Date: 2025-01-21, DOI: 10.1016/j.neunet.2025.107179

Bo Wang, Heng Yuan, Lizuo Liu, Wenzhong Zhang, Wei Cai

Neural Networks, volume 185, Article 107179. https://www.sciencedirect.com/science/article/pii/S0893608025000589

Citations: 0

Abstract

In this paper, we derive diffusion equation models in the spectral domain to study the evolution of the training error of two-layer multiscale deep neural networks (MscaleDNN) (Cai and Xu, 2019; Liu et al., 2020), which are designed to reduce the spectral bias of fully connected deep neural networks in approximating oscillatory functions. The diffusion models are obtained from the spectral form of the error equation of the MscaleDNN, which is derived using a neural tangent kernel approach for gradient descent training with a sine activation function, assuming a vanishing learning rate and infinite network width and domain size. The diffusion coefficients involved are shown to have larger supports when more scales are used in the MscaleDNN; the proposed diffusion equation models in the frequency domain thus explain the MscaleDNN’s spectral bias reduction capability. The diffusion model in the Fourier-spectral domain makes clear how the training error decays at different Fourier frequencies. Numerical results of the diffusion models for two-layer MscaleDNN training match the error evolution of actual gradient descent training with a reasonably large network width, validating the diffusion models. The numerical results for MscaleDNN also show error decay over a wide frequency range, confirming the advantage of using MscaleDNN to approximate functions with a wide range of frequencies.
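As a rough illustration of the architecture the abstract refers to, the following is a minimal NumPy sketch of a two-layer multiscale network with a sine activation on 1-D inputs. It is not the authors' implementation: the function names (`init_params`, `mscale_forward`, `effective_frequencies`), the parameter layout, the initialization, and the choice of scales 1, 2, 4, 8 are all assumptions made for illustration. The only idea taken from the abstract is that the input is fed to several sub-networks at different scales and passed through a sine activation.

```python
import numpy as np

def init_params(rng, n_scales, width_per_scale):
    """Random parameters for n_scales sub-networks (hypothetical layout)."""
    shape = (n_scales, width_per_scale)
    return {
        "w": rng.normal(size=shape),                  # hidden weights
        "b": rng.uniform(-np.pi, np.pi, size=shape),  # hidden biases
        "a": rng.normal(size=shape) / np.sqrt(n_scales * width_per_scale),  # output weights
    }

def mscale_forward(params, x, scales):
    """Evaluate sum_s sum_j a[s, j] * sin(w[s, j] * scales[s] * x + b[s, j])."""
    scaled_x = scales[None, :, None] * x[:, None, None]        # (n_points, n_scales, 1)
    z = params["w"][None] * scaled_x + params["b"][None]       # (n_points, n_scales, width)
    return np.sum(params["a"][None] * np.sin(z), axis=(1, 2))  # (n_points,)

def effective_frequencies(params, scales):
    """|w| * scale: the frequency of each hidden sine unit as a function of x."""
    return np.abs(params["w"]) * scales[:, None]

rng = np.random.default_rng(0)
scales = np.array([1.0, 2.0, 4.0, 8.0])  # assumed scales, chosen for illustration
params = init_params(rng, n_scales=scales.size, width_per_scale=50)
x = np.linspace(-1.0, 1.0, 200)
y = mscale_forward(params, x, scales)

# Compare the largest hidden-unit frequency with and without input scaling.
single = effective_frequencies(params, np.ones_like(scales)).max()
multi = effective_frequencies(params, scales).max()
print(y.shape, single, multi)
```

With the same random weights, scaling the inputs lets the hidden sine units reach higher frequencies at initialization. This only loosely mirrors the abstract's statement about larger supports of the diffusion coefficients; it does not reproduce the paper's diffusion models or its training experiments.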


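Source journal

Neural Networks (Engineering & Technology, Computer Science: Artificial Intelligence)

CiteScore: 13.90

Self-citation rate: 7.70%

Publications: 425

Review time: 67 days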

About the journal: Neural Networks is a platform that aims to foster an international community of scholars and practitioners interested in neural networks, deep learning, and other approaches to artificial intelligence and machine learning. The journal invites submissions covering various aspects of neural networks research, from computational neuroscience and cognitive modeling to mathematical analyses and engineering applications. By providing a forum for interdisciplinary discussions between biology and technology, it aims to encourage the development of biologically inspired artificial intelligence.