Notes on Ridge Functions and Neural Networks

V. Ismailov
DOI: 10.2139/ssrn.3618165
Journal: Mathematics eJournal
Published: 2020-05-28
Citations: 5

Abstract

These notes are about ridge functions. Recent years have witnessed a flurry of interest in these functions. Ridge functions appear in various fields and under various guises: in fields as diverse as partial differential equations (where they are called plane waves), computerized tomography, and statistics. These functions are also the underpinnings of many central models in neural networks. We are interested in ridge functions from the point of view of approximation theory. The basic goal in approximation theory is to approximate complicated objects by simpler ones. Among the many classes of multivariate functions, linear combinations of ridge functions form a class of simpler functions. These notes study some problems of approximation of multivariate functions by linear combinations of ridge functions, and present various properties of these functions. The questions we ask are as follows. When can a multivariate function be expressed as a linear combination of ridge functions from a certain class? When do such linear combinations represent every multivariate function? If a precise representation is not possible, can one approximate arbitrarily well? If approximation with arbitrary accuracy fails, how can one compute or estimate the error of approximation, and how does one know that a best approximation exists? How can one characterize and construct best approximations? If a smooth function is a sum of arbitrarily behaved ridge functions, can it be expressed as a sum of smooth ridge functions? We also study properties of generalized ridge functions, which are closely related to linear superpositions and Kolmogorov's famous superposition theorem. These notes end with a few applications of ridge functions to the problem of approximation by single and two hidden layer neural networks with a restricted set of weights. We hope that these notes will be useful and interesting to both researchers and students.
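The central objects of the abstract can be made concrete in a few lines. A ridge function is a multivariate function of the form x ↦ g(a·x), a univariate profile g composed with a fixed direction a, and a single-hidden-layer network with activation σ is exactly a linear combination of such functions, Σᵢ cᵢ σ(wᵢ·x + bᵢ). The sketch below is illustrative only; the specific weights and activation are hypothetical, not taken from the notes.

```python
import numpy as np

def ridge(g, a):
    """Build the ridge function x -> g(a . x) for a univariate g
    and a fixed direction vector a."""
    a = np.asarray(a, dtype=float)
    return lambda x: g(np.dot(np.asarray(x, dtype=float), a))

def one_layer_net(weights, biases, coeffs, sigma=np.tanh):
    """A single-hidden-layer network: a linear combination of ridge
    functions sum_i c_i * sigma(w_i . x + b_i)."""
    def f(x):
        x = np.asarray(x, dtype=float)
        return sum(c * sigma(np.dot(np.asarray(w, dtype=float), x) + b)
                   for w, b, c in zip(weights, biases, coeffs))
    return f

# A ridge function constant along directions orthogonal to a = (1, 0):
r = ridge(np.sin, [1.0, 0.0])

# A toy two-neuron network on R^2 (hypothetical parameters):
f = one_layer_net(weights=[[1.0, 2.0], [0.5, -1.0]],
                  biases=[0.0, 0.3],
                  coeffs=[1.5, -2.0])
```

Each hidden neuron contributes one ridge function with profile σ and direction wᵢ, which is why approximation-theoretic results about sums of ridge functions translate directly into density and error estimates for such networks.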