Sparse regression as a sparse eigenvalue problem

Baback Moghaddam, Amit Gruber, Yair Weiss, S. Avidan
{"title":"稀疏回归作为稀疏特征值问题","authors":"Baback Moghaddam, Amit Gruber, Yair Weiss, S. Avidan","doi":"10.1109/ITA.2008.4601036","DOIUrl":null,"url":null,"abstract":"We extend the l0-norm ldquosubspectralrdquo algorithms developed for sparse-LDA (Moghaddam, 2006) and sparse-PCA (Moghaddam, 2006) to more general quadratic costs such as MSE in linear (or kernel) regression. The resulting ldquosparse least squaresrdquo (SLS) problem is also NP-hard, by way of its equivalence to a rank-1 sparse eigenvalue problem. Specifically, for minimizing general quadratic cost functions we use a highly-efficient method for direct eigenvalue computation based on partitioned matrix inverse techniques that leads to times103 speed-ups over standard eigenvalue decomposition. This increased efficiency mitigates the O(n4) complexity that limited the previous algorithmspsila utility for high-dimensional problems. Moreover, the new computation prioritizes the role of the less-myopic backward elimination stage which becomes even more efficient than forward selection. Similarly, branch-and-bound search for exact sparse least squares (ESLS) also benefits from partitioned matrix techniques. Our greedy sparse least squares (GSLS) algorithm generalizes Natarajanpsilas algorithm (Natarajan, 1995) also known as order-recursive matching pursuit (ORMP). Specifically, the forward pass of GSLS is exactly equivalent to ORMP but is more efficient, and by including the backward pass, which only doubles the computation, we can achieve a lower MSE than ORMP. In experimental comparisons with LARS (Efron, 2004), forward-GSLS is shown to be not only more efficient and accurate but more flexible in terms of choice of regularization.","PeriodicalId":345196,"journal":{"name":"2008 Information Theory and Applications Workshop","volume":"1993 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2008-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"22","resultStr":"{\"title\":\"Sparse regression as a sparse eigenvalue problem\",\"authors\":\"Baback Moghaddam, Amit Gruber, Yair Weiss, S. Avidan\",\"doi\":\"10.1109/ITA.2008.4601036\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We extend the l0-norm ldquosubspectralrdquo algorithms developed for sparse-LDA (Moghaddam, 2006) and sparse-PCA (Moghaddam, 2006) to more general quadratic costs such as MSE in linear (or kernel) regression. The resulting ldquosparse least squaresrdquo (SLS) problem is also NP-hard, by way of its equivalence to a rank-1 sparse eigenvalue problem. Specifically, for minimizing general quadratic cost functions we use a highly-efficient method for direct eigenvalue computation based on partitioned matrix inverse techniques that leads to times103 speed-ups over standard eigenvalue decomposition. This increased efficiency mitigates the O(n4) complexity that limited the previous algorithmspsila utility for high-dimensional problems. Moreover, the new computation prioritizes the role of the less-myopic backward elimination stage which becomes even more efficient than forward selection. Similarly, branch-and-bound search for exact sparse least squares (ESLS) also benefits from partitioned matrix techniques. Our greedy sparse least squares (GSLS) algorithm generalizes Natarajanpsilas algorithm (Natarajan, 1995) also known as order-recursive matching pursuit (ORMP). 
Specifically, the forward pass of GSLS is exactly equivalent to ORMP but is more efficient, and by including the backward pass, which only doubles the computation, we can achieve a lower MSE than ORMP. In experimental comparisons with LARS (Efron, 2004), forward-GSLS is shown to be not only more efficient and accurate but more flexible in terms of choice of regularization.\",\"PeriodicalId\":345196,\"journal\":{\"name\":\"2008 Information Theory and Applications Workshop\",\"volume\":\"1993 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2008-08-15\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"22\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2008 Information Theory and Applications Workshop\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ITA.2008.4601036\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2008 Information Theory and Applications Workshop","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ITA.2008.4601036","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 22

Abstract

We extend the l0-norm "subspectral" algorithms developed for sparse-LDA (Moghaddam, 2006) and sparse-PCA (Moghaddam, 2006) to more general quadratic costs such as MSE in linear (or kernel) regression. The resulting "sparse least squares" (SLS) problem is also NP-hard, by way of its equivalence to a rank-1 sparse eigenvalue problem. Specifically, for minimizing general quadratic cost functions we use a highly efficient method for direct eigenvalue computation based on partitioned matrix inverse techniques that leads to 10^3× speed-ups over standard eigenvalue decomposition. This increased efficiency mitigates the O(n^4) complexity that limited the previous algorithms' utility for high-dimensional problems. Moreover, the new computation prioritizes the role of the less-myopic backward elimination stage, which becomes even more efficient than forward selection. Similarly, branch-and-bound search for exact sparse least squares (ESLS) also benefits from partitioned matrix techniques. Our greedy sparse least squares (GSLS) algorithm generalizes Natarajan's algorithm (Natarajan, 1995), also known as order-recursive matching pursuit (ORMP). Specifically, the forward pass of GSLS is exactly equivalent to ORMP but is more efficient, and by including the backward pass, which only doubles the computation, we can achieve a lower MSE than ORMP. In experimental comparisons with LARS (Efron, 2004), forward-GSLS is shown to be not only more efficient and accurate but also more flexible in terms of choice of regularization.
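The abstract's key observation is that, roughly, for a fixed support S the least-squares residual is y'y - b_S' A_S^{-1} b_S with A = X'X and b = X'y, so picking the best k-feature subset amounts to maximizing a rank-1 generalized Rayleigh quotient over sparse vectors, hence the "sparse eigenvalue" view. The sketch below is a minimal NumPy illustration of the greedy forward-selection / backward-elimination idea only; it is not the authors' implementation, it recomputes each subset fit from scratch rather than using the partitioned matrix-inverse updates that provide the paper's reported speed-ups, and the function names are illustrative.

import numpy as np

def gsls_forward(X, y, k):
    """Greedy forward selection for k-sparse least squares (ORMP-style sketch).

    At each step, add the feature whose inclusion most reduces the residual
    sum of squares, i.e. most increases the explained energy
    y' X_S (X_S' X_S)^{-1} X_S' y.
    """
    n, d = X.shape
    selected = []
    for _ in range(k):
        best_j, best_rss = None, np.inf
        for j in range(d):
            if j in selected:
                continue
            S = selected + [j]
            Xs = X[:, S]
            # Least-squares fit on the candidate subset (recomputed naively here)
            w, *_ = np.linalg.lstsq(Xs, y, rcond=None)
            rss = np.sum((y - Xs @ w) ** 2)
            if rss < best_rss:
                best_rss, best_j = rss, j
        selected.append(best_j)
    return selected

def gsls_backward(X, y, selected, k):
    """Backward elimination: from a larger support, repeatedly drop the feature
    whose removal increases the residual sum of squares the least, until only
    k features remain."""
    selected = list(selected)
    while len(selected) > k:
        best_j, best_rss = None, np.inf
        for j in selected:
            S = [i for i in selected if i != j]
            Xs = X[:, S]
            w, *_ = np.linalg.lstsq(Xs, y, rcond=None)
            rss = np.sum((y - Xs @ w) ** 2)
            if rss < best_rss:
                best_rss, best_j = rss, j
        selected.remove(best_j)
    return selected

As a usage sketch, one might run gsls_forward(X, y, 2 * k) to build a candidate support and then gsls_backward(X, y, support, k) to prune it back to k features; the paper's point is that, with partitioned-inverse updates, the backward pass adds only about as much work as the forward pass while typically lowering the final MSE relative to forward selection (ORMP) alone.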