Generalization, Adaptation and Low-Rank Representation in Neural Networks
Samet Oymak, Zalan Fabian, Mingchen Li, M. Soltanolkotabi
{"title":"神经网络的泛化、自适应与低秩表示","authors":"Samet Oymak, Zalan Fabian, Mingchen Li, M. Soltanolkotabi","doi":"10.1109/IEEECONF44664.2019.9048845","DOIUrl":null,"url":null,"abstract":"We develop a data-dependent optimization and generalization theory for neural networks which leverages the lowrankness of the Jacobian matrix associated with the network. Our results help demystify why training and generalization is easier on clean and structured datasets and harder on noisy and unstructured datasets. Specifically, we show that over the principal eigendirections of the Jacobian matrix space learning is fast and one can quickly train a model with zero training loss that can also generalize well. Over the smaller eigendirections, training is slower and early stopping can help with generalization at the expense of some bias. We also discuss how neural networks can learn better representations over time in terms of the Jacobian mapping. We conduct various numerical experiments on deep networks that corroborate our theoretical findings and demonstrate that: (i) the Jacobian of typical neural networks exhibit low-rank structure with a few large singular values and many small ones, (ii) most of the useful label information lies on the principal eigendirections where learning is fast, and (iii) Jacobian adapts over time and learn better representations.","PeriodicalId":6684,"journal":{"name":"2019 53rd Asilomar Conference on Signals, Systems, and Computers","volume":"9 1","pages":"581-585"},"PeriodicalIF":0.0000,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":"{\"title\":\"Generalization, Adaptation and Low-Rank Representation in Neural Networks\",\"authors\":\"Samet Oymak, Zalan Fabian, Mingchen Li, M. Soltanolkotabi\",\"doi\":\"10.1109/IEEECONF44664.2019.9048845\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We develop a data-dependent optimization and generalization theory for neural networks which leverages the lowrankness of the Jacobian matrix associated with the network. Our results help demystify why training and generalization is easier on clean and structured datasets and harder on noisy and unstructured datasets. Specifically, we show that over the principal eigendirections of the Jacobian matrix space learning is fast and one can quickly train a model with zero training loss that can also generalize well. Over the smaller eigendirections, training is slower and early stopping can help with generalization at the expense of some bias. We also discuss how neural networks can learn better representations over time in terms of the Jacobian mapping. 
We conduct various numerical experiments on deep networks that corroborate our theoretical findings and demonstrate that: (i) the Jacobian of typical neural networks exhibit low-rank structure with a few large singular values and many small ones, (ii) most of the useful label information lies on the principal eigendirections where learning is fast, and (iii) Jacobian adapts over time and learn better representations.\",\"PeriodicalId\":6684,\"journal\":{\"name\":\"2019 53rd Asilomar Conference on Signals, Systems, and Computers\",\"volume\":\"9 1\",\"pages\":\"581-585\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-11-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"4\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2019 53rd Asilomar Conference on Signals, Systems, and Computers\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/IEEECONF44664.2019.9048845\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 53rd Asilomar Conference on Signals, Systems, and Computers","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IEEECONF44664.2019.9048845","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Generalization, Adaptation and Low-Rank Representation in Neural Networks
We develop a data-dependent optimization and generalization theory for neural networks that leverages the low-rankness of the Jacobian matrix associated with the network. Our results help demystify why training and generalization are easier on clean, structured datasets and harder on noisy, unstructured ones. Specifically, we show that learning is fast over the principal eigendirections of the Jacobian matrix, and one can quickly train a model to zero training loss that also generalizes well. Over the smaller eigendirections, training is slower, and early stopping can help generalization at the expense of some bias. We also discuss how neural networks can learn better representations over time in terms of the Jacobian mapping. We conduct various numerical experiments on deep networks that corroborate our theoretical findings and demonstrate that: (i) the Jacobian of typical neural networks exhibits low-rank structure, with a few large singular values and many small ones; (ii) most of the useful label information lies on the principal eigendirections, where learning is fast; and (iii) the Jacobian adapts over time and learns better representations.
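The following is a minimal sketch, not the authors' code, of how one might probe these claims empirically on a toy PyTorch network: it forms the Jacobian of the network outputs with respect to the parameters on a small batch, inspects its singular-value spectrum (point (i)), and measures how much of the label vector lies on the top left-singular directions (point (ii)). The architecture, batch size, toy labels, and the cutoff k are illustrative assumptions, not choices from the paper.

```python
# Sketch: probe the low-rank structure of a network's parameter Jacobian
# and the alignment of labels with its principal eigendirections.
import torch
import torch.nn as nn

torch.manual_seed(0)

n, d = 64, 10                                        # batch size, input dimension (illustrative)
net = nn.Sequential(nn.Linear(d, 128), nn.ReLU(), nn.Linear(128, 1))
x = torch.randn(n, d)
y = torch.sign(x[:, 0])                              # toy binary labels

params = list(net.parameters())

# Build the (n x p) Jacobian: one row per example, one column per parameter.
rows = []
for i in range(n):
    out = net(x[i:i + 1]).squeeze()                  # scalar output for example i
    grads = torch.autograd.grad(out, params)         # gradient w.r.t. every parameter tensor
    rows.append(torch.cat([g.reshape(-1) for g in grads]))
J = torch.stack(rows)                                # shape (n, num_parameters)

# (i) Spectrum: typically a few large singular values and many small ones.
U, S, _ = torch.linalg.svd(J, full_matrices=False)
print("top 5 singular values:   ", S[:5].tolist())
print("bottom 5 singular values:", S[-5:].tolist())

# (ii) Fraction of the (centered) label energy on the top-k left eigendirections of J.
k = 5
y_centered = y - y.mean()
frac = ((U[:, :k].T @ y_centered).norm() ** 2 / y_centered.norm() ** 2).item()
print(f"fraction of label energy on top-{k} eigendirections: {frac:.3f}")
```

If the picture from the paper holds, the singular values should decay sharply and most of the label energy should concentrate on the top few eigendirections, the regime where the theory predicts fast training and good generalization.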