A variable metric proximal stochastic gradient method: An application to classification problems

Impact factor: 2.6 | JCR quartile: Q2 | Category: Operations Research & Management Science
Pasquale Cascarano, Giorgia Franchini, Erich Kobler, Federica Porta, Andrea Sebastiani
EURO Journal on Computational Optimization, Volume 12, Article 100088. Published 2024-01-01. DOI: 10.1016/j.ejco.2024.100088
Publisher page: https://www.sciencedirect.com/science/article/pii/S2192440624000054
Citations: 0

A variable metric proximal stochastic gradient method: An application to classification problems

Due to the continued success of machine learning and deep learning in particular, supervised classification problems are ubiquitous in numerous scientific fields. Training these models typically involves the minimization of the empirical risk over large data sets along with a possibly non-differentiable regularization. In this paper, we introduce a stochastic gradient method for the considered classification problem. To control the variance of the objective's gradients, we use an automatic sample size selection along with a variable metric to precondition the stochastic gradient directions. Further, we utilize a non-monotone line search to automatize step size selection. Convergence results are provided for both convex and non-convex objective functions. Extensive numerical experiments verify that the suggested approach performs on par with state-of-the-art methods for training both statistical models for binary classification and artificial neural networks for multi-class image classification. The code is publicly available at https://github.com/koblererich/lisavm.
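
As a rough illustration of the kind of update the abstract describes, the sketch below performs a proximal stochastic gradient step scaled by a diagonal metric, applied to L1-regularized logistic regression. It is not the authors' implementation and does not reproduce the code in their repository. Everything here is an assumption made for illustration: the function names, the L1 regularizer, the diagonal metric built from a running average of squared gradients, and the clipping bounds. A fixed mini-batch size and a constant step size stand in for the paper's automatic sample size selection and non-monotone line search, which are not shown.

```python
import numpy as np


def soft_threshold(z, tau):
    """Componentwise prox of tau * |.|; tau may be a scalar or an array."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)


def logistic_minibatch_grad(x, A, y, idx):
    """Gradient of the mean logistic loss log(1 + exp(-y * (A @ x))) over rows idx.

    Labels y are assumed to take values in {-1, +1}.
    """
    margins = y[idx] * (A[idx] @ x)
    # 1 / (1 + exp(m)) written via tanh to avoid overflow for large |m|
    weights = -y[idx] * 0.5 * (1.0 - np.tanh(0.5 * margins))
    return A[idx].T @ weights / len(idx)


def vm_prox_step(x, grad, d, alpha, lam):
    """One scaled proximal step for lam * ||x||_1 in the metric diag(d), d > 0.

    Solves argmin_u lam * ||u||_1 + (1 / (2 * alpha)) * ||u - z||_D^2
    with z = x - alpha * grad / d, which separates per coordinate.
    """
    z = x - alpha * grad / d
    return soft_threshold(z, alpha * lam / d)


def run_sketch(A, y, lam=0.01, alpha=0.05, batch=32, steps=200, seed=0):
    """Hypothetical driver: fixed batch size and constant step size (assumptions)."""
    rng = np.random.default_rng(seed)
    n, p = A.shape
    x = np.zeros(p)
    v = np.zeros(p)  # running average of squared gradient entries
    for _ in range(steps):
        idx = rng.choice(n, size=min(batch, n), replace=False)
        g = logistic_minibatch_grad(x, A, y, idx)
        v = 0.9 * v + 0.1 * g**2
        # Assumed diagonal metric, clipped so it stays bounded and positive
        d = np.clip(np.sqrt(v), 1e-2, 1e2)
        x = vm_prox_step(x, g, d, alpha, lam)
    return x
```

With a diagonal metric the scaled proximal operator of the L1 norm reduces to soft thresholding with a per-coordinate threshold `alpha * lam / d`, which is why no inner solver is needed in this simplified setting. Other regularizers or non-diagonal metrics would generally require a different proximal computation.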

Source journal: EURO Journal on Computational Optimization
Category: Operations Research & Management Science
CiteScore: 3.50
Self-citation rate: 0.00%
Articles published: 28
Review time: 60 days
Journal description: The aim of this journal is to contribute to the many areas in which Operations Research and Computer Science are tightly connected with each other. More precisely, the common element in all contributions to this journal is the use of computers for the solution of optimization problems. Both methodological contributions and innovative applications are considered, but validation through convincing computational experiments is desirable. The journal publishes three types of articles: (i) research articles, (ii) tutorials, and (iii) surveys. A research article presents original methodological contributions. A tutorial provides an introduction to an advanced topic designed to ease the use of the relevant methodology. A survey provides a wide overview of a given subject by summarizing and organizing research results.