Pasquale Cascarano, Giorgia Franchini, Erich Kobler, Federica Porta, Andrea Sebastiani
{"title":"可变度量近似随机梯度法:分类问题的应用","authors":"Pasquale Cascarano , Giorgia Franchini , Erich Kobler , Federica Porta , Andrea Sebastiani","doi":"10.1016/j.ejco.2024.100088","DOIUrl":null,"url":null,"abstract":"<div><p>Due to the continued success of machine learning and deep learning in particular, supervised classification problems are ubiquitous in numerous scientific fields. Training these models typically involves the minimization of the empirical risk over large data sets along with a possibly non-differentiable regularization. In this paper, we introduce a stochastic gradient method for the considered classification problem. To control the variance of the objective's gradients, we use an automatic sample size selection along with a variable metric to precondition the stochastic gradient directions. Further, we utilize a non-monotone line search to automatize step size selection. Convergence results are provided for both convex and non-convex objective functions. Extensive numerical experiments verify that the suggested approach performs on par with state-of-the-art methods for training both statistical models for binary classification and artificial neural networks for multi-class image classification. The code is publicly available at <span>https://github.com/koblererich/lisavm</span><svg><path></path></svg>.</p></div>","PeriodicalId":51880,"journal":{"name":"EURO Journal on Computational Optimization","volume":"12 ","pages":"Article 100088"},"PeriodicalIF":2.6000,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2192440624000054/pdfft?md5=738f38c0990532c2e5eec98c42a34bd4&pid=1-s2.0-S2192440624000054-main.pdf","citationCount":"0","resultStr":"{\"title\":\"A variable metric proximal stochastic gradient method: An application to classification problems\",\"authors\":\"Pasquale Cascarano , Giorgia Franchini , Erich Kobler , Federica Porta , Andrea Sebastiani\",\"doi\":\"10.1016/j.ejco.2024.100088\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>Due to the continued success of machine learning and deep learning in particular, supervised classification problems are ubiquitous in numerous scientific fields. Training these models typically involves the minimization of the empirical risk over large data sets along with a possibly non-differentiable regularization. In this paper, we introduce a stochastic gradient method for the considered classification problem. To control the variance of the objective's gradients, we use an automatic sample size selection along with a variable metric to precondition the stochastic gradient directions. Further, we utilize a non-monotone line search to automatize step size selection. Convergence results are provided for both convex and non-convex objective functions. Extensive numerical experiments verify that the suggested approach performs on par with state-of-the-art methods for training both statistical models for binary classification and artificial neural networks for multi-class image classification. 
The code is publicly available at <span>https://github.com/koblererich/lisavm</span><svg><path></path></svg>.</p></div>\",\"PeriodicalId\":51880,\"journal\":{\"name\":\"EURO Journal on Computational Optimization\",\"volume\":\"12 \",\"pages\":\"Article 100088\"},\"PeriodicalIF\":2.6000,\"publicationDate\":\"2024-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.sciencedirect.com/science/article/pii/S2192440624000054/pdfft?md5=738f38c0990532c2e5eec98c42a34bd4&pid=1-s2.0-S2192440624000054-main.pdf\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"EURO Journal on Computational Optimization\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S2192440624000054\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"OPERATIONS RESEARCH & MANAGEMENT SCIENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"EURO Journal on Computational Optimization","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2192440624000054","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"OPERATIONS RESEARCH & MANAGEMENT SCIENCE","Score":null,"Total":0}
A variable metric proximal stochastic gradient method: An application to classification problems
Due to the continued success of machine learning, and of deep learning in particular, supervised classification problems are ubiquitous in numerous scientific fields. Training these models typically involves minimizing the empirical risk over large data sets, along with a possibly non-differentiable regularization. In this paper, we introduce a stochastic gradient method for such classification problems. To control the variance of the objective's gradients, we use an automatic sample size selection along with a variable metric to precondition the stochastic gradient directions. Further, we utilize a non-monotone line search to automate step size selection. Convergence results are provided for both convex and non-convex objective functions. Extensive numerical experiments verify that the suggested approach performs on par with state-of-the-art methods for training both statistical models for binary classification and artificial neural networks for multi-class image classification. The code is publicly available at https://github.com/koblererich/lisavm.
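To make the core update concrete, the sketch below shows one variable-metric proximal stochastic gradient step for an l1-regularized objective, assuming a diagonal metric D_k: a preconditioned gradient step on a mini-batch followed by a proximal (soft-thresholding) step in the metric. This is a minimal illustration, not the authors' lisavm implementation; the helper names (soft_threshold, vm_prox_sg_step), the fixed batch size, the fixed step size, and the toy metric are all assumptions made for the example, whereas the paper selects the sample size and step size automatically.

```python
# Minimal sketch of a variable-metric proximal stochastic gradient step
# for min_x f(x) + lam * ||x||_1 with a diagonal metric D_k.
# Illustrative only; names and parameters are not from the paper's code.
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (coordinate-wise soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def vm_prox_sg_step(x, grad_batch, step, metric_diag, lam):
    """One preconditioned proximal step.

    x           : current iterate
    grad_batch  : stochastic gradient estimated on the current mini-batch
    step        : step size (in the paper, chosen by a non-monotone line search)
    metric_diag : diagonal of the variable metric D_k (positive entries)
    lam         : l1 regularization weight
    """
    # Preconditioned gradient step: x - step * D_k^{-1} * grad
    z = x - step * grad_batch / metric_diag
    # Proximal step in the metric D_k: per-coordinate threshold step*lam/d_i
    return soft_threshold(z, step * lam / metric_diag)

# Toy usage on a random least-squares problem (hypothetical data).
rng = np.random.default_rng(0)
A, b = rng.standard_normal((200, 50)), rng.standard_normal(200)
x = np.zeros(50)
for k in range(100):
    # Fixed batch here; the paper grows the sample size automatically.
    idx = rng.choice(200, size=20, replace=False)
    g = A[idx].T @ (A[idx] @ x - b[idx]) / len(idx)   # mini-batch gradient
    d = np.full(50, 1.0 + 1.0 / (k + 1))              # simple diagonal metric
    x = vm_prox_sg_step(x, g, step=0.01, metric_diag=d, lam=0.1)
```

With a diagonal metric, the proximal step stays coordinate-wise and cheap: each coordinate is soft-thresholded with its own threshold step*lam/d_i, which is what makes variable-metric preconditioning practical at scale.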
About the journal:
The aim of this journal is to contribute to the many areas in which Operations Research and Computer Science are tightly connected. More precisely, the common element in all contributions to this journal is the use of computers for the solution of optimization problems. Both methodological contributions and innovative applications are considered, but validation through convincing computational experiments is desirable. The journal publishes three types of articles: (i) research articles, (ii) tutorials, and (iii) surveys. A research article presents original methodological contributions. A tutorial provides an introduction to an advanced topic, designed to ease the use of the relevant methodology. A survey provides a wide overview of a given subject by summarizing and organizing research results.