MIP-BOOST: Efficient and Effective L0 Feature Selection for Linear Regression

Ana Kenney, Francesca Chiaromonte, Giovanni Felici

Journal of Computational and Graphical Statistics, 2021, pp. 566-577
DOI: 10.1080/10618600.2020.1845184 (Epub 2021-01-04)
Citations: 12
Abstract
Recent advances in mathematical programming have made Mixed Integer Optimization a competitive alternative to popular regularization methods for selecting features in regression problems. The approach has unquestionable foundational appeal and versatility, but also poses important challenges. Here we propose MIP-BOOST, a revision of standard Mixed Integer Programming feature selection that reduces the computational burden of tuning the critical sparsity bound parameter and improves performance in the presence of feature collinearity and of signals that vary in nature and strength. The final outcome is a more efficient and effective L0 Feature Selection method for applications of realistic size and complexity, grounded on rigorous cross-validation tuning and exact optimization of the associated Mixed Integer Program. Computational viability and improved performance in realistic scenarios are achieved through three independent but synergistic proposals. Supplementary materials including additional results, pseudocode, and computer code are available online.
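To make the setting concrete, below is a minimal sketch of the problem the abstract describes: L0 (best-subset) feature selection for linear regression, with the sparsity bound k tuned by cross-validation. This is not the authors' MIP-BOOST algorithm — it replaces the Mixed Integer Program with a brute-force search over subsets, which is only feasible for small p, and all function names and the toy data are illustrative assumptions.

```python
# Illustrative sketch only: exhaustive L0 (best-subset) selection with
# cross-validated choice of the sparsity bound k. MIP-BOOST instead solves
# this selection problem exactly via Mixed Integer Programming at scale.
import itertools
import numpy as np

def best_subset_ols(X, y, k):
    """Best OLS fit among all k-feature subsets (brute-force L0 selection)."""
    n, p = X.shape
    best_rss, best_subset, best_coef = np.inf, None, None
    for subset in itertools.combinations(range(p), k):
        Xs = X[:, subset]
        coef, *_ = np.linalg.lstsq(Xs, y, rcond=None)
        resid = y - Xs @ coef
        rss = resid @ resid
        if rss < best_rss:
            best_rss, best_subset, best_coef = rss, subset, coef
    return best_subset, best_coef

def cv_select_k(X, y, k_grid, n_folds=5, seed=0):
    """Pick the sparsity bound k by K-fold cross-validated prediction error."""
    n = X.shape[0]
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), n_folds)
    cv_errors = []
    for k in k_grid:
        err = 0.0
        for test_idx in folds:
            train = np.ones(n, dtype=bool)
            train[test_idx] = False
            subset, coef = best_subset_ols(X[train], y[train], k)
            pred = X[test_idx][:, subset] @ coef
            err += np.sum((y[test_idx] - pred) ** 2)
        cv_errors.append(err / n)
    return k_grid[int(np.argmin(cv_errors))]

# Toy data (assumed for illustration): only features 0 and 3 carry signal.
rng = np.random.default_rng(1)
X = rng.standard_normal((120, 6))
y = 3.0 * X[:, 0] - 2.0 * X[:, 3] + 0.1 * rng.standard_normal(120)
k_hat = cv_select_k(X, y, k_grid=[1, 2, 3])
subset, _ = best_subset_ols(X, y, 2)
```

The brute-force search visits all C(p, k) subsets, so its cost explodes with p; the point of the MIP formulation is to obtain the same exact L0 solution through branch-and-bound rather than enumeration, and the paper's proposals target the cost of repeating that solve across the cross-validation grid for k.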