A Scale Invariant Measure of Flatness for Deep Network Minima

Akshay Rangamani, Nam H. Nguyen, Abhishek Kumar, D. Phan, S. Chin, T. Tran
{"title":"A Scale Invariant Measure of Flatness for Deep Network Minima","authors":"Akshay Rangamani, Nam H. Nguyen, Abhishek Kumar, D. Phan, S. Chin, T. Tran","doi":"10.1109/ICASSP39728.2021.9413771","DOIUrl":null,"url":null,"abstract":"It has been empirically observed that the flatness of minima obtained from training deep networks seems to correlate with better generalization. However, for deep networks with positively homogeneous activations, most measures of flatness are not invariant to rescaling of the network parameters. This means that the measure of flatness can be made as small or as large as possible through rescaling, rendering the quantitative measures meaningless. In this paper we show that for deep networks with positively homogenous activations, these rescalings constitute equivalence relations, and that these equivalence relations induce a quotient manifold structure in the parameter space. Using an appropriate Riemannian metric, we propose a Hessian-based measure for flatness that is invariant to rescaling and perform simulations to empirically verify our claim. Finally we perform experiments to verify that our flatness measure correlates with generalization by using minibatch stochastic gradient descent with different batch sizes to find deep network minima with different generalization properties.","PeriodicalId":347060,"journal":{"name":"ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2021-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICASSP39728.2021.9413771","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

It has been empirically observed that the flatness of minima obtained from training deep networks seems to correlate with better generalization. However, for deep networks with positively homogeneous activations, most measures of flatness are not invariant to rescaling of the network parameters. This means that the measure of flatness can be made as small or as large as possible through rescaling, rendering the quantitative measures meaningless. In this paper we show that for deep networks with positively homogenous activations, these rescalings constitute equivalence relations, and that these equivalence relations induce a quotient manifold structure in the parameter space. Using an appropriate Riemannian metric, we propose a Hessian-based measure for flatness that is invariant to rescaling and perform simulations to empirically verify our claim. Finally we perform experiments to verify that our flatness measure correlates with generalization by using minibatch stochastic gradient descent with different batch sizes to find deep network minima with different generalization properties.
深度网络最小值平坦度的尺度不变度量
从经验上观察到,从训练深度网络获得的最小值的平坦度似乎与更好的泛化有关。然而,对于具有正均匀激活的深度网络,大多数平坦度度量对网络参数的重新缩放不是不变的。这意味着可以通过重新缩放使平面度的度量尽可能小或尽可能大,从而使定量度量变得毫无意义。在本文中,我们证明了对于具有正齐次激活的深度网络,这些重标构成等价关系,并且这些等价关系在参数空间中推导出商流形结构。使用适当的黎曼度量,我们提出了一种基于黑森的平坦度度量,该度量对重新缩放是不变的,并进行模拟以经验验证我们的主张。最后,我们通过实验验证了我们的平坦度度量与泛化的相关性,使用不同批大小的小批随机梯度下降来寻找具有不同泛化性质的深度网络最小值。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信