{"title":"Alternating Transfer Functions to Prevent Overfitting in Non-Linear Regression with Neural Networks","authors":"Philipp Seitz, Jan Schmitt","doi":"10.1080/0952813x.2023.2270995","DOIUrl":null,"url":null,"abstract":"In nonlinear regression with machine learning methods, neural networks (NNs) are ideally suited due to their universal approximation property, which states that arbitrary nonlinear functions can thereby be approximated arbitrarily well. Unfortunately, this property also poses the problem that data points with measurement errors can be approximated too well and unknown parameter subspaces in the estimation can deviate far from the actual value (so-called overfitting). Various developed methods aim to reduce overfitting through modifications in several areas of the training. In this work, we pursue the question of how an NN behaves in training with respect to overfitting when linear and nonlinear transfer functions (TF) are alternated in different hidden layers (HL). The presented approach is applied to a generated dataset and contrasted to established methods from the literature, both individually and in combination. Comparable results are obtained, whereby the common use of purely nonlinear transfer functions proves to be not recommended generally.","PeriodicalId":133720,"journal":{"name":"Journal of Experimental and Theoretical Artificial Intelligence","volume":"885 ","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-10-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Experimental and Theoretical Artificial Intelligence","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1080/0952813x.2023.2270995","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
In nonlinear regression with machine learning methods, neural networks (NNs) are ideally suited because of their universal approximation property, which states that arbitrary nonlinear functions can be approximated arbitrarily well. Unfortunately, this property also poses a problem: data points afflicted by measurement errors can be approximated too well, and the estimate in unknown regions of the parameter space can deviate far from the true value (so-called overfitting). Various established methods aim to reduce overfitting through modifications to different parts of the training process. In this work, we investigate how an NN behaves during training with respect to overfitting when linear and nonlinear transfer functions (TFs) are alternated across the hidden layers (HLs). The presented approach is applied to a generated dataset and compared with established methods from the literature, both individually and in combination. Comparable results are obtained, and the common practice of using purely nonlinear transfer functions proves not to be generally advisable.
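
To make the core idea concrete, below is a minimal sketch of a regression MLP whose hidden layers alternate between a linear (identity) and a nonlinear transfer function. The abstract does not specify the layer widths, the particular nonlinear TF, or the training setup, so tanh, the layer sizes, and the toy data here are illustrative assumptions rather than the authors' implementation.

```python
# Minimal sketch (assumptions: tanh as the nonlinear TF, the layer widths,
# and the toy sine data are illustrative choices, not taken from the paper).
import torch
import torch.nn as nn


class AlternatingTFNet(nn.Module):
    """MLP whose hidden layers alternate linear (identity) and nonlinear (tanh)
    transfer functions, instead of applying a nonlinear TF in every hidden layer."""

    def __init__(self, in_dim=1, hidden=32, n_hidden_layers=4, out_dim=1):
        super().__init__()
        layers = []
        dim = in_dim
        for i in range(n_hidden_layers):
            layers.append(nn.Linear(dim, hidden))
            # Even-indexed hidden layers keep the identity (linear TF);
            # odd-indexed ones apply tanh (nonlinear TF).
            if i % 2 == 1:
                layers.append(nn.Tanh())
            dim = hidden
        layers.append(nn.Linear(dim, out_dim))
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)


if __name__ == "__main__":
    # Toy usage: noisy samples of a nonlinear target, fitted with plain MSE.
    torch.manual_seed(0)
    x = torch.linspace(-3, 3, 200).unsqueeze(1)
    y = torch.sin(x) + 0.1 * torch.randn_like(x)  # simulated measurement noise

    model = AlternatingTFNet()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()
    for _ in range(2000):
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        opt.step()
    print(f"final training MSE: {loss.item():.4f}")
```

The design choice to interleave identity layers reduces the share of layers that can bend the fitted function, which is the mechanism by which the paper argues overfitting to noisy points can be mitigated without extra regularisation terms.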