{"title":"Unraveling the Hessian: A Key to Smooth Convergence in Loss Function Landscapes","authors":"Nikita Kiselev, Andrey Grabovoy","doi":"arxiv-2409.11995","DOIUrl":null,"url":null,"abstract":"The loss landscape of neural networks is a critical aspect of their training,\nand understanding its properties is essential for improving their performance.\nIn this paper, we investigate how the loss surface changes when the sample size\nincreases, a previously unexplored issue. We theoretically analyze the\nconvergence of the loss landscape in a fully connected neural network and\nderive upper bounds for the difference in loss function values when adding a\nnew object to the sample. Our empirical study confirms these results on various\ndatasets, demonstrating the convergence of the loss function surface for image\nclassification tasks. Our findings provide insights into the local geometry of\nneural loss landscapes and have implications for the development of sample size\ndetermination techniques.","PeriodicalId":501301,"journal":{"name":"arXiv - CS - Machine Learning","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Machine Learning","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.11995","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
The loss landscape of neural networks is a critical aspect of their training, and understanding its properties is essential for improving their performance. In this paper, we investigate how the loss surface changes as the sample size increases, a previously unexplored issue. We theoretically analyze the convergence of the loss landscape in a fully connected neural network and derive upper bounds on the difference in loss function values when a new object is added to the sample. Our empirical study confirms these results on various datasets, demonstrating the convergence of the loss function surface for image classification tasks. Our findings provide insights into the local geometry of neural loss landscapes and have implications for the development of sample size determination techniques.
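The central quantity in the abstract, the change in the empirical loss when one more object is added to the sample, is easy to probe numerically. The sketch below is illustrative only and is not the authors' code or experimental setup: it assumes a fixed two-layer ReLU network with random weights on synthetic Gaussian data, and prints |L_{k+1}(w) - L_k(w)| for growing k. Since L_{k+1}(w) - L_k(w) = (l_{k+1}(w) - L_k(w)) / (k+1), the difference decays at roughly an O(1/k) rate when per-sample losses are bounded, which is the kind of convergence the paper's upper bounds formalize.

```python
# Minimal sketch (not the authors' code): how the empirical loss
# L_k(w) = (1/k) * sum_i l_i(w) changes when one more sample is added.
# The model, data, and loss below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic binary classification data (assumed, for illustration only).
d, n_max = 20, 2000
X = rng.normal(size=(n_max, d))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)

# Fixed two-layer fully connected network with random weights.
h = 32
W1 = rng.normal(scale=1.0 / np.sqrt(d), size=(d, h))
W2 = rng.normal(scale=1.0 / np.sqrt(h), size=(h,))

def per_sample_loss(X, y):
    """Binary cross-entropy of each sample at the fixed weights w = (W1, W2)."""
    z = np.maximum(X @ W1, 0.0) @ W2          # forward pass with ReLU hidden layer
    p = 1.0 / (1.0 + np.exp(-z))              # sigmoid output probability
    return -(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))

losses = per_sample_loss(X, y)

# |L_{k+1}(w) - L_k(w)| for increasing k; expected to shrink roughly as O(1/k).
for k in [10, 100, 1000]:
    L_k  = losses[:k].mean()
    L_k1 = losses[:k + 1].mean()
    print(f"k={k:5d}  |L_(k+1) - L_k| = {abs(L_k1 - L_k):.6f}")
```

Running this prints differences that shrink as k grows, mirroring (at a single fixed weight vector) the pointwise convergence of the loss surface that the paper studies over a neighborhood of the minimum.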