Proximal methods for sparse optimal scoring and discriminant analysis
Summer Atkins, Gudmundur Einarsson, Line Clemmensen, Brendan Ames
Advances in Data Analysis and Classification 17(4), 983-1036. Published: 21 December 2022. DOI: 10.1007/s11634-022-00530-6
Linear discriminant analysis (LDA) is a classical method for dimensionality reduction, in which discriminant vectors are sought to project data to a lower-dimensional space for optimal separability of classes. Several recent papers have outlined strategies, based on exploiting sparsity of the discriminant vectors, for performing LDA in the high-dimensional setting where the number of features exceeds the number of observations. However, many of these proposals lack scalable algorithms for solving the underlying optimization problems. We consider an optimization scheme, based on block coordinate descent, for solving the sparse optimal scoring formulation of LDA. Each iteration of this algorithm requires an update of a scoring vector, which admits an analytic formula, and an update of the corresponding discriminant vector, which requires solution of a convex subproblem; we propose several variants of this algorithm in which the proximal gradient method or the alternating direction method of multipliers (ADMM) is used to solve the subproblem. We show that the per-iteration cost of these methods scales linearly in the dimension of the data when restricted regularization terms are employed, and cubically in the dimension of the data in the worst case. Furthermore, we establish that when this block coordinate descent framework generates convergent subsequences of iterates, these subsequences converge to stationary points of the sparse optimal scoring problem. We demonstrate the effectiveness of our methods with empirical results for classification of Gaussian data and of data sets drawn from benchmarking repositories, including time-series and multispectral X-ray data, and provide Matlab and R implementations of our optimization schemes.
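For context, the sparse optimal scoring formulation referenced in the abstract follows the sparse discriminant analysis framework of Clemmensen et al., on which this paper builds. A standard statement of the problem for the k-th scoring/discriminant pair, with notation assumed here rather than taken from the paper itself, is:

```latex
\min_{\theta_k,\,\beta_k}\ \
  \|Y\theta_k - X\beta_k\|_2^2
  + \gamma\,\beta_k^{\top}\Omega\,\beta_k
  + \lambda\,\|\beta_k\|_1
\quad\text{s.t.}\quad
  \tfrac{1}{n}\,\theta_k^{\top} Y^{\top} Y\,\theta_k = 1,\ \
  \theta_k^{\top} Y^{\top} Y\,\theta_\ell = 0 \ \text{for } \ell < k,
```

where X is the n-by-p data matrix, Y the n-by-K class-indicator matrix, beta_k the sparse discriminant vector, theta_k the scoring vector, Omega a positive semidefinite regularization matrix, and gamma and lambda nonnegative tuning parameters. Fixing theta_k makes the beta_k-subproblem a convex l1-penalized least-squares problem, while fixing beta_k leaves a scoring subproblem with a closed-form solution; this is the structure the block coordinate descent scheme exploits.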
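The authors provide Matlab and R implementations; purely as an illustration of the block coordinate descent loop the abstract describes, the following Python sketch alternates an analytic scoring-vector update with a proximal gradient (ISTA) solve of the discriminant-vector subproblem. All function names (`sos_bcd`, `soft_threshold`) and parameter defaults are ours, not the authors'; the sketch takes Omega to be the identity, computes a single discriminant direction, and omits the deflation and ADMM variants of the paper.

```python
import numpy as np

def soft_threshold(v, t):
    """Elementwise soft-thresholding: the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def sos_bcd(X, Y, lam=1e-2, gamma=1e-3, n_outer=50, n_inner=100, seed=0):
    """Block coordinate descent for one sparse optimal scoring direction.

    X : (n, p) centered data matrix; Y : (n, K) class-indicator matrix.
    Minimizes ||Y@theta - X@beta||^2 + gamma*||beta||^2 + lam*||beta||_1
    over beta, with theta constrained to theta' D theta = 1, D = Y'Y / n.
    """
    n, p = X.shape
    K = Y.shape[1]
    D = (Y.T @ Y) / n          # K x K diagonal matrix of class proportions
    e = np.ones(K)             # the trivial all-ones score, to be excluded

    def project(v):
        # Enforce orthogonality to the trivial score in the D inner product.
        return v - (e @ (D @ v)) / (e @ (D @ e)) * e

    rng = np.random.default_rng(seed)
    theta = project(rng.standard_normal(K))
    theta /= np.sqrt(theta @ D @ theta)
    beta = np.zeros(p)

    # Step size 1/L, where L = 2 * (sigma_max(X)^2 + gamma) bounds the
    # Lipschitz constant of the gradient of the smooth part.
    step = 1.0 / (2.0 * (np.linalg.norm(X, 2) ** 2 + gamma))

    for _ in range(n_outer):
        # beta-update: proximal gradient (ISTA) on the l1-penalized subproblem.
        target = Y @ theta
        for _ in range(n_inner):
            grad = 2.0 * (X.T @ (X @ beta - target) + gamma * beta)
            beta = soft_threshold(beta - step * grad, step * lam)
        # theta-update: analytic solution of the scoring subproblem.
        w = project(np.linalg.solve(D, Y.T @ (X @ beta)))
        nrm = np.sqrt(w @ D @ w)
        if nrm > 0:
            theta = w / nrm
    return beta, theta
```

With a centered X and a one-hot Y, `sos_bcd` returns a sparse beta whose nonzero entries select the discriminating features. Note that the soft-thresholding step is what keeps the per-iteration cost low when sparsity-inducing penalties are used, which is the source of the linear scaling the abstract claims for restricted regularization terms.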
Journal Introduction:
The international journal Advances in Data Analysis and Classification (ADAC) is designed as a forum for high-standard publications on research and applications concerning the extraction of knowable aspects from many types of data. It publishes articles on topics such as structural, quantitative, or statistical approaches to the analysis of data; advances in classification, clustering, and pattern recognition methods; strategies for modeling complex data and mining large data sets; methods for the extraction of knowledge from data; and applications of advanced methods in specific domains of practice. Articles illustrate how new domain-specific knowledge can be made available from data by skillful use of data analysis methods. The journal also publishes survey papers that outline and illuminate the basic ideas and techniques of special approaches.