M. Kohler, A. Krzyżak
Bernoulli 27(1), pp. 2564-2597, published 2021-11-01
DOI: 10.3150/21-BEJ1323
Over-parametrized deep neural networks minimizing the empirical risk do not generalize well
Several recent papers showed that backpropagation is able to find the global minimum of the empirical risk on the training data using over-parametrized deep neural networks. In this paper, a similar result is shown for deep neural networks with the sigmoidal squasher activation function in a regression setting, and a lower bound is presented which proves that these networks do not generalize well on new data, in the sense that networks minimizing the empirical risk do not achieve the optimal minimax rate of convergence for the estimation of smooth regression functions.
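The phenomenon the abstract describes can be illustrated with a toy sketch (not the paper's construction, and using piecewise-linear interpolation as a simple stand-in for an over-parametrized network): an estimator that drives the empirical risk to zero by interpolating noisy training data can have a larger prediction error than a smoother, more biased fit. All names and numbers below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
m = lambda x: np.sin(2.0 * np.pi * x)  # smooth regression function

# Noisy regression sample: Y = m(X) + eps
n, sigma = 30, 1.0
x_train = np.sort(rng.uniform(0.0, 1.0, n))
y_train = m(x_train) + rng.normal(0.0, sigma, n)

# "Interpolating" estimator: piecewise-linear interpolation hits every
# training point exactly, so its empirical risk is exactly zero.
interp = lambda x: np.interp(x, x_train, y_train)

# Smoother estimator: cubic least-squares fit (nonzero empirical risk).
coef = np.polyfit(x_train, y_train, 3)
smooth = lambda x: np.polyval(coef, x)

# Empirical risk of the interpolant is zero ...
train_mse_interp = np.mean((interp(x_train) - y_train) ** 2)

# ... but its error against the true regression function m is larger
# than that of the smoother fit.
x_test = np.linspace(x_train[0], x_train[-1], 2000)
test_mse_interp = np.mean((interp(x_test) - m(x_test)) ** 2)
test_mse_smooth = np.mean((smooth(x_test) - m(x_test)) ** 2)
print(train_mse_interp, test_mse_interp, test_mse_smooth)
```

This only mirrors the qualitative message (zero empirical risk need not imply good generalization); the paper's actual lower bound concerns minimax rates for neural network estimates, which the sketch makes no attempt to reproduce.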
About the journal:
BERNOULLI is the journal of the Bernoulli Society for Mathematical Statistics and Probability, issued four times per year. The journal provides a comprehensive account of important developments in the fields of statistics and probability, offering an international forum for both theoretical and applied work.
BERNOULLI will publish:
Papers containing original and significant research contributions, with background, mathematical derivation, and discussion of the results in suitable detail and, where appropriate, discussion of interesting applications of the proposed methodology.
Papers of the following two types will also be considered for publication, provided they are judged to enhance the dissemination of research:
Review papers which provide an integrated critical survey of some area of probability and statistics and discuss important recent developments.
Scholarly papers on historically significant aspects of statistics and probability.