On the accuracy of interpolation based on single-layer artificial neural networks with a focus on defeating the Runge phenomenon

IF 3.1 · Region 3, Computer Science · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Soft Computing · Pub Date: 2024-07-29 · DOI: 10.1007/s00500-024-09918-2

Ferdinando Auricchio, Maria Roberta Belardo, Francesco Calabrò, Gianluca Fabiani, Ariel F. Pascaner


Citations: 0

Abstract

Artificial Neural Networks (ANNs) are a tool in approximation theory widely used to solve interpolation problems. In fact, ANNs can be regarded as functions, since they take an input and return an output. The structure of the specifically adopted network determines the underlying approximation space, while the form of the function is selected by fixing the parameters of the network. In the present paper, we consider one-hidden-layer ANNs with a feedforward architecture, also referred to as shallow or two-layer networks, so that the structure is determined by the number and types of neurons. The determination of the parameters that define the function, called training, is done by solving the approximation problem, that is, by imposing interpolation through a set of specific nodes. We present the case where the parameters are trained using a procedure referred to as Extreme Learning Machine (ELM), which leads to a linear interpolation problem. Under these hypotheses, the existence of an ANN interpolating function is guaranteed. Given that the ANN is interpolating, the error is incurred only away from the interpolation nodes provided by the user. In this study, various choices of nodes are analyzed: equispaced, Chebychev, and randomly selected ones. Then, the focus is on regular target functions, for which it is known that interpolation can lead to spurious oscillations, a phenomenon that in the ANN literature is referred to as overfitting. We obtain good accuracy of the ANN interpolating function in all tested cases using these different types of interpolating nodes and different types of neurons. The study starts from the well-known bell-shaped Runge example, which makes it clear that a global interpolating polynomial is accurate only if it is built on suitably chosen nodes, for example the Chebychev ones. In order to evaluate the behavior when the number of interpolation nodes increases, we increase the number of neurons in our network and compare it with the interpolating polynomial. We test using Runge’s function and other well-known examples with different regularities. As expected, the accuracy of the approximation with a global polynomial increases only if the Chebychev nodes are considered. Instead, the error for the ANN interpolating function always decays, and in most cases we observe that the convergence follows what is observed in the polynomial case on Chebychev nodes, regardless of the set of nodes used for training. We therefore conclude that the use of such an ANN defeats the Runge phenomenon. Our results show the power of ANNs to achieve excellent approximations when interpolating regular functions even when starting from uniform and random nodes, particularly for Runge’s function.
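As a rough illustration of the comparison described in the abstract, the following Python sketch interpolates Runge's function 1/(1 + 25 x^2) on [-1, 1] with a global polynomial and with a single-hidden-layer network trained in the ELM style: the hidden weights and biases are drawn at random and kept fixed, so only the output weights remain to be found, and they solve a linear system. The tanh activation, the sampling ranges for slopes and centers, the node count, the function names, and the use of `numpy.linalg.lstsq` are assumptions made here for illustration; they are not taken from the paper, and the ELM accuracy depends on them.

```python
import numpy as np
from numpy.polynomial import chebyshev as cheb


def runge(x):
    """Runge's bell-shaped function on [-1, 1]."""
    return 1.0 / (1.0 + 25.0 * x**2)


def make_nodes(n, kind, rng):
    """Return n sorted interpolation nodes in [-1, 1]."""
    if kind == "equispaced":
        return np.linspace(-1.0, 1.0, n)
    if kind == "chebyshev":
        k = np.arange(n)
        # Roots of the degree-n Chebyshev polynomial of the first kind.
        return np.sort(np.cos((2 * k + 1) * np.pi / (2 * n)))
    if kind == "random":
        return np.sort(rng.uniform(-1.0, 1.0, n))
    raise ValueError(f"unknown node kind: {kind}")


def poly_interpolant(x_nodes, y_nodes):
    """Global interpolating polynomial of degree n-1, expressed in the Chebyshev basis."""
    coef = cheb.chebfit(x_nodes, y_nodes, len(x_nodes) - 1)
    return lambda x: cheb.chebval(x, coef)


def elm_interpolant(x_nodes, y_nodes, rng):
    """ELM-style network: random fixed hidden layer, linearly trained output weights."""
    n = len(x_nodes)                         # one neuron per node (square system)
    centers = rng.uniform(-1.0, 1.0, n)      # assumption: transition points inside the domain
    slopes = rng.uniform(-n / 2, n / 2, n)   # assumption: slope range grows with n
    biases = -slopes * centers

    def hidden(x):
        # Row i holds the n hidden-neuron outputs for input x[i].
        return np.tanh(np.outer(x, slopes) + biases)

    # Training reduces to the linear problem H w = y, solved in the least-squares sense.
    w, *_ = np.linalg.lstsq(hidden(x_nodes), y_nodes, rcond=None)
    return lambda x: hidden(x) @ w


def max_error(f, g, m=2001):
    """Maximum absolute difference between f and g on a fine uniform grid."""
    x = np.linspace(-1.0, 1.0, m)
    return float(np.max(np.abs(f(x) - g(x))))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 21
    for kind in ("equispaced", "chebyshev", "random"):
        x = make_nodes(n, kind, rng)
        y = runge(x)
        e_poly = max_error(runge, poly_interpolant(x, y))
        e_elm = max_error(runge, elm_interpolant(x, y, rng))
        print(f"{kind:>10}: polynomial {e_poly:.2e}, ELM {e_elm:.2e}")
```

Running the script prints the maximum error of both interpolants for each node family. The polynomial error on equispaced nodes should be large (the Runge phenomenon) and small on Chebychev nodes; the ELM figures vary with the assumed parameter ranges and the random seed, so they should be read as indicative only.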


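Source journal

Soft Computing (Engineering and Technology; Computer Science: Interdisciplinary Applications)

CiteScore: 8.10

Self-citation rate: 9.80%

Articles published: 927

Review time: 7.3 months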

About the journal: Soft Computing is dedicated to system solutions based on soft computing techniques. It provides rapid dissemination of important results in soft computing technologies, a fusion of research in evolutionary algorithms and genetic programming, neural science and neural net systems, fuzzy set theory and fuzzy systems, and chaos theory and chaotic systems. Soft Computing encourages the integration of soft computing techniques and tools into both everyday and advanced applications. By linking the ideas and techniques of soft computing with other disciplines, the journal serves as a unifying platform that fosters comparisons, extensions, and new applications. As a result, the journal is an international forum for all scientists and engineers engaged in research and development in this fast-growing field.