Compressed sensing: a discrete optimization approach

IF 2.9 Zone 3 Computer Science Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Machine Learning Pub Date : 2024-07-11 DOI:10.1007/s10994-024-06577-0

Dimitris Bertsimas, Nicholas A. G. Johnson

Citations: 0

Abstract

We study the Compressed Sensing (CS) problem, which is the problem of finding the sparsest vector that satisfies a set of linear measurements up to some numerical tolerance. CS is a central problem in Statistics, Operations Research and Machine Learning, which arises in applications such as signal processing, data compression, image reconstruction, and multi-label learning. We introduce an \(\ell_2\) regularized formulation of CS which we reformulate as a mixed integer second-order cone program. We derive a second-order cone relaxation of this problem and show that under mild conditions on the regularization parameter, the resulting relaxation is equivalent to the well-studied basis pursuit denoising problem. We present a semidefinite relaxation that strengthens the second-order cone relaxation and develop a custom branch-and-bound algorithm that leverages our second-order cone relaxation to solve small-scale instances of CS to certifiable optimality. When compared against solutions produced by three state-of-the-art benchmark methods on synthetic data, our numerical results show that our approach produces solutions that are on average \(6.22\%\) sparser. When compared only against the experiment-wise best performing benchmark method on synthetic data, our approach produces solutions that are on average \(3.10\%\) sparser. On real-world ECG data, for a given \(\ell_2\) reconstruction error our approach produces solutions that are on average \(9.95\%\) sparser than benchmark methods (\(3.88\%\) sparser if only compared against the best performing benchmark), while for a given sparsity level our approach produces solutions that have on average \(10.77\%\) lower reconstruction error than benchmark methods (\(1.42\%\) lower error if only compared against the best performing benchmark). When used as a component of a multi-label classification algorithm, our approach achieves greater classification accuracy than benchmark compressed sensing methods. This improved accuracy comes at the cost of an increase in computation time by several orders of magnitude. Thus, for applications where runtime is not of critical importance, leveraging integer optimization can yield sparser and lower error solutions to CS than existing benchmarks.
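
To make the problem statement in the abstract concrete, the sketch below finds a sparsest vector \(x\) with \(\Vert Ax - b\Vert_2^2 \le \epsilon\) on a tiny instance by exhaustively enumerating supports in order of increasing size. This is not the authors' branch-and-bound algorithm or any of the benchmark methods; it is a brute-force illustration of what solving CS "to certifiable optimality" means, and of why exact methods are costly. The function name `sparsest_solution`, the symbols `A`, `b`, `eps`, the squared-norm form of the tolerance, and the toy data are all assumptions made for this example.

```python
import itertools

import numpy as np


def sparsest_solution(A, b, eps):
    """Return (x, support) with the fewest nonzeros such that ||Ax - b||_2^2 <= eps.

    Brute force (an illustrative assumption, not the paper's method):
    try every support of size 0, 1, 2, ... and fit it by least squares.
    The first feasible support found has minimum cardinality.
    """
    n = A.shape[1]
    for k in range(n + 1):
        for support in itertools.combinations(range(n), k):
            x = np.zeros(n)
            if k > 0:
                cols = list(support)
                # Best coefficients on this support in the least-squares sense.
                coef, *_ = np.linalg.lstsq(A[:, cols], b, rcond=None)
                x[cols] = coef
            if np.sum((A @ x - b) ** 2) <= eps:
                return x, support
    # Reached only if even the full least-squares fit violates the tolerance.
    return None, None


# Hypothetical toy instance: 6 measurements of a 2-sparse vector in R^10.
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 10))
x_true = np.zeros(10)
x_true[[2, 7]] = [1.5, -2.0]
b = A @ x_true

x_hat, support = sparsest_solution(A, b, eps=1e-10)
print("recovered support:", support)
```

The number of candidate supports grows combinatorially with the dimension, so this enumeration is only usable on toy problems; it is meant to show the objective being certified, not a practical solver.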

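Source journal

Machine Learning (Engineering & Technology, Computer Science: Artificial Intelligence)

CiteScore: 11.00

Self-citation rate: 2.70%

Articles published: 162

Review time: 3 months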

Journal description: Machine Learning serves as a global platform dedicated to computational approaches in learning. The journal reports substantial findings on diverse learning methods applied to various problems, offering support through empirical studies, theoretical analysis, or connections to psychological phenomena. It demonstrates the application of learning methods to solve significant problems and aims to enhance the conduct of machine learning research with a focus on verifiable and replicable evidence in published papers.