Limited-memory Common-directions Method With Subsampled Newton Directions for Large-scale Linear Classification

Jui-Nan Yen, Chih-Jen Lin
{"title":"Limited-memory Common-directions Method With Subsampled Newton Directions for Large-scale Linear Classification","authors":"Jui-Nan Yen, Chih-Jen Lin","doi":"10.1109/ICDM51629.2021.00188","DOIUrl":null,"url":null,"abstract":"The common-directions method is an optimization method recently proposed to utilize second-order information. It is especially efficient on large-scale linear classification problems, and it is competitive with state-of-the-art optimization methods like BFGS, LBFGS, and Nesterov’s accelerated gradient method. The main idea of the method is to minimize the local quadratic approximation within the selected subspace. Regarding the selection of the subspace, the original authors only focused on the span of current and past gradient directions. In this work, we analyze the impact of subspace selection, and point out that the lack of direction diversity can be a potential weakness for using gradients as directions. To address this problem, we propose the use of subsampled Newton directions, which always possess diversity unless they are already close to the true Newton direction. Our experiments on large-scale linear classification problems show that our proposed methods are generally better than subsampled Newton methods and the original common-directions method.","PeriodicalId":320970,"journal":{"name":"2021 IEEE International Conference on Data Mining (ICDM)","volume":"2014 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE International Conference on Data Mining (ICDM)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDM51629.2021.00188","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

The common-directions method is an optimization method recently proposed to utilize second-order information. It is especially efficient on large-scale linear classification problems, and it is competitive with state-of-the-art optimization methods like BFGS, LBFGS, and Nesterov’s accelerated gradient method. The main idea of the method is to minimize the local quadratic approximation within the selected subspace. Regarding the selection of the subspace, the original authors only focused on the span of current and past gradient directions. In this work, we analyze the impact of subspace selection, and point out that the lack of direction diversity can be a potential weakness for using gradients as directions. To address this problem, we propose the use of subsampled Newton directions, which always possess diversity unless they are already close to the true Newton direction. Our experiments on large-scale linear classification problems show that our proposed methods are generally better than subsampled Newton methods and the original common-directions method.
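To make the abstract's idea concrete, below is a minimal sketch (not taken from the paper) of one way a limited-memory common-directions loop with subsampled Newton directions could look for L2-regularized logistic regression. The function names, the subsample fraction `sample_rate`, the memory size `m`, and the fixed unit step are all illustrative assumptions; the authors' actual algorithm includes components such as line search that are omitted here.

```python
import numpy as np

def conjugate_gradient(Av, b, tol=1e-6, max_iter=100):
    """Standard CG for A x = b, where A is given as a matvec `Av`."""
    x = np.zeros_like(b)
    r = b - Av(x)
    p = r.copy()
    rs = r @ r
    for _ in range(max_iter):
        Ap = Av(p)
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

def commdir_subsampled_newton(X, y, C=1.0, m=5, sample_rate=0.05,
                              iters=50, seed=0):
    """Sketch of a limited-memory common-directions loop whose stored
    directions are subsampled Newton directions, for L2-regularized
    logistic regression:
        f(w) = 0.5 * w'w + C * sum_i log(1 + exp(-y_i * x_i' w)),
    with labels y_i in {-1, +1}."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    P = []  # limited memory: at most the m most recent directions

    def grad(w):
        z = np.clip(y * (X @ w), -30, 30)      # clip to avoid overflow
        return w - C * (X.T @ (y / (1.0 + np.exp(z))))

    def hess_vec(w, v, idx):
        # Subsampled Hessian-vector product:
        #   H_S v = v + C * (n / |S|) * X_S' D_S X_S v,
        # where D_S has entries p_i * (1 - p_i) for the sampled points.
        Xs = X[idx]
        z = np.clip(y[idx] * (Xs @ w), -30, 30)
        p = 1.0 / (1.0 + np.exp(-z))
        D = p * (1.0 - p)
        return v + C * (n / len(idx)) * (Xs.T @ (D * (Xs @ v)))

    for _ in range(iters):
        g = grad(w)
        idx = rng.choice(n, size=max(1, int(sample_rate * n)), replace=False)
        # Subsampled Newton direction: approximately solve H_S d = -g by CG.
        P.append(conjugate_gradient(lambda v: hess_vec(w, v, idx), -g))
        P = P[-m:]
        Pm = np.stack(P, axis=1)               # d x k basis of the subspace
        # Minimize the quadratic model over span(P):
        #   (P' H_S P) t = -P' g, then step w <- w + P t.
        HP = np.stack([hess_vec(w, Pm[:, j], idx)
                       for j in range(Pm.shape[1])], axis=1)
        t = np.linalg.solve(Pm.T @ HP + 1e-10 * np.eye(Pm.shape[1]),
                            -(Pm.T @ g))
        w = w + Pm @ t                         # the paper's method would add a line search here
    return w
```

In this sketch, each iteration draws a fresh random subsample for the Hessian, so consecutive stored directions naturally differ from one another unless each already approximates the true Newton direction, which mirrors the diversity argument the abstract makes against using only past gradients as the subspace.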