Convex Optimization for Shallow Neural Networks

2019 57th Annual Allerton Conference on Communication, Control, and Computing (Allerton) | Pub Date: 2019-09-01 | DOI: 10.1109/ALLERTON.2019.8919769

Tolga Ergen, Mert Pilanci


Citations: 14

Abstract

We consider non-convex training of shallow neural networks and introduce a convex relaxation approach with theoretical guarantees. For the single-neuron case, we prove that the relaxation preserves the location of the global minimum under a planted model assumption. Therefore, a globally optimal solution can be efficiently found via a gradient method. We show that gradient descent applied to the relaxation always outperforms gradient descent on the original non-convex loss at no additional computational cost. We then characterize this relaxation as a regularizer and further introduce extensions to multi-neuron single-hidden-layer networks.
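The non-convex baseline that the abstract compares against can be illustrated with a minimal sketch. It runs gradient descent on a single-neuron loss under a planted model, where the labels are generated by a hidden weight vector. The ReLU activation, the squared loss, the Gaussian data, the step-size rule, and all names (`w_star`, `train`, and so on) are assumptions made for illustration. The abstract does not specify them, and this sketch does not implement the paper's convex relaxation.

```python
import random


def relu(z):
    return z if z > 0.0 else 0.0


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def loss(w, X, y):
    # Assumed objective: mean squared error of a single ReLU neuron.
    n = len(X)
    return sum((relu(dot(x, w)) - t) ** 2 for x, t in zip(X, y)) / (2 * n)


def grad(w, X, y):
    # Gradient of the loss above; samples with an inactive neuron contribute zero.
    n = len(X)
    g = [0.0] * len(w)
    for x, t in zip(X, y):
        z = dot(x, w)
        if z > 0.0:
            r = z - t
            for j, xj in enumerate(x):
                g[j] += r * xj / n
    return g


def train(X, y, w0, steps=200):
    # Conservative step size: n / sum ||x_i||^2 is at most 1 / lambda_max(X^T X / n).
    n = len(X)
    step = n / sum(dot(x, x) for x in X)
    w = list(w0)
    losses = [loss(w, X, y)]
    for _ in range(steps):
        g = grad(w, X, y)
        w = [wj - step * gj for wj, gj in zip(w, g)]
        losses.append(loss(w, X, y))
    return w, losses


# Planted model (assumed form): labels come from a hidden neuron w_star.
rng = random.Random(0)
d, n = 5, 200
w_star = [rng.gauss(0, 1) for _ in range(d)]
X = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(n)]
y = [relu(dot(x, w_star)) for x in X]

w0 = [rng.gauss(0, 1) for _ in range(d)]
w_hat, losses = train(X, y, w0)
print("initial loss:", losses[0], "final loss:", losses[-1])
```

Because this loss is non-convex in `w`, the run is not guaranteed to recover `w_star` from every initialization. That gap is what the abstract's relaxation is claimed to close in the single-neuron case.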
