{"title":"Limited-memory Common-directions Method With Subsampled Newton Directions for Large-scale Linear Classification","authors":"Jui-Nan Yen, Chih-Jen Lin","doi":"10.1109/ICDM51629.2021.00188","DOIUrl":null,"url":null,"abstract":"The common-directions method is an optimization method recently proposed to utilize second-order information. It is especially efficient on large-scale linear classification problems, and it is competitive with state-of-the-art optimization methods like BFGS, LBFGS, and Nesterov’s accelerated gradient method. The main idea of the method is to minimize the local quadratic approximation within the selected subspace. Regarding the selection of the subspace, the original authors only focused on the span of current and past gradient directions. In this work, we analyze the impact of subspace selection, and point out that the lack of direction diversity can be a potential weakness for using gradients as directions. To address this problem, we propose the use of subsampled Newton directions, which always possess diversity unless they are already close to the true Newton direction. Our experiments on large-scale linear classification problems show that our proposed methods are generally better than subsampled Newton methods and the original common-directions method.","PeriodicalId":320970,"journal":{"name":"2021 IEEE International Conference on Data Mining (ICDM)","volume":"2014 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE International Conference on Data Mining (ICDM)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDM51629.2021.00188","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
The common-directions method is a recently proposed optimization method that utilizes second-order information. It is especially efficient on large-scale linear classification problems, and it is competitive with state-of-the-art optimization methods such as BFGS, L-BFGS, and Nesterov’s accelerated gradient method. The main idea of the method is to minimize the local quadratic approximation of the objective within a selected subspace. For the selection of this subspace, the original work considered only the span of the current and past gradient directions. In this work, we analyze the impact of subspace selection and point out that a lack of direction diversity can be a weakness of using gradients as directions. To address this problem, we propose the use of subsampled Newton directions, which always possess diversity unless they are already close to the true Newton direction. Our experiments on large-scale linear classification problems show that the proposed methods are generally better than subsampled Newton methods and the original common-directions method.
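To make the two ingredients in the abstract concrete, the following is a minimal sketch (not the authors’ implementation) for L2-regularized logistic regression: a subsampled Newton direction obtained by running conjugate gradient with Hessian-vector products on a random subsample of the data, and a common-directions step that minimizes the local quadratic model over a small subspace spanned by the columns of a matrix P. The function names, the specific loss, and parameters such as sample_frac and cg_iters are illustrative assumptions, not details from the paper.

```python
import numpy as np

def grad(w, X, y, lam):
    # Gradient of  lam/2 ||w||^2 + sum_i log(1 + exp(-y_i x_i^T w)),
    # with labels y_i in {-1, +1}.
    z = y * (X @ w)
    sigma = 1.0 / (1.0 + np.exp(z))           # = 1 - sigmoid(y_i x_i^T w)
    return lam * w - X.T @ (y * sigma)

def hess_vec(w, v, X, y, lam):
    # Hessian-vector product: H v = lam v + X^T diag(D) X v,
    # where D_ii = s_i (1 - s_i) and s_i = sigmoid(y_i x_i^T w).
    z = y * (X @ w)
    s = 1.0 / (1.0 + np.exp(-z))
    D = s * (1.0 - s)
    return lam * v + X.T @ (D * (X @ v))

def subsampled_newton_direction(w, X, y, lam, sample_frac=0.1, cg_iters=20):
    # Full gradient, but a Hessian estimated from a random subsample S:
    # solve H_S d = -g approximately by conjugate gradient, where each
    # Hessian-vector product touches only |S| << n data points.
    n = X.shape[0]
    S = np.random.choice(n, max(1, int(sample_frac * n)), replace=False)
    Xs, ys = X[S], y[S]
    scale = n / len(S)                        # rescale the loss part of H_S
    g = grad(w, X, y, lam)
    d, r = np.zeros_like(w), -g.copy()
    p, rs = r.copy(), r @ r
    for _ in range(cg_iters):
        Hp = lam * p + scale * hess_vec(w, p, Xs, ys, 0.0)
        alpha = rs / (p @ Hp)
        d += alpha * p
        r -= alpha * Hp
        rs_new = r @ r
        if np.sqrt(rs_new) < 1e-8:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return d

def common_directions_step(w, P, X, y, lam):
    # Minimize the local quadratic model over span(P):
    #   min_t  g^T (P t) + 0.5 (P t)^T H (P t)
    # which reduces to the small k x k system (P^T H P) t = -P^T g.
    g = grad(w, X, y, lam)
    HP = np.column_stack([hess_vec(w, P[:, j], X, y, lam)
                          for j in range(P.shape[1])])
    t = np.linalg.solve(P.T @ HP + 1e-10 * np.eye(P.shape[1]), -P.T @ g)
    return w + P @ t
```

In the original common-directions method the columns of P are current and past gradients; under the proposal sketched above they would instead hold subsampled Newton directions, whose greater diversity is the point of the paper. The construction and limited-memory truncation of P are left to the caller in this sketch.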