Embedded feature selection for robust probability learning machines

IF 7.5 | Zone 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Pattern Recognition | Pub Date: 2024-11-12 | DOI: 10.1016/j.patcog.2024.111157

Miguel Carrasco , Benjamin Ivorra , Julio López , Angel M. Ramos

Pattern Recognition, volume 159, Article 111157. Full text: https://www.sciencedirect.com/science/article/pii/S0031320324009087

Citations: 0

Abstract

Methods:

Feature selection is essential for building effective machine learning models in binary classification. Eliminating unnecessary features can reduce the risk of overfitting and improve classification performance. Moreover, the data we handle typically contains a stochastic component, making it important to develop robust models that are insensitive to data perturbations. Although there are numerous methods and tools for feature selection, relatively few studies address embedded feature selection within robust classification models using penalization techniques.

Objective:

In this work, we introduce robust classifiers with integrated feature selection capabilities, utilizing probability machines based on different penalization techniques, such as the $\ell_1$-norm or the elastic-net, combined with a novel Direct Feature Elimination process to improve model resilience and efficiency.
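For reference, the two penalties named above are usually written as follows. This is the textbook form only: the symbols $w$ (weight vector), $n$ (number of features), and $\lambda_1, \lambda_2 \ge 0$ (penalty parameters) are assumptions introduced here, and the paper's own formulation may use a different scaling or parametrization.

```latex
\[
P_{\ell_1}(w) = \lambda_1 \|w\|_1 = \lambda_1 \sum_{j=1}^{n} |w_j|,
\qquad
P_{\mathrm{EN}}(w) = \lambda_1 \|w\|_1 + \frac{\lambda_2}{2} \|w\|_2^2 .
\]
```

The $\ell_1$ term drives individual weights exactly to zero, which is what makes the feature selection "embedded" in training; the added squared $\ell_2$ term of the elastic-net stabilizes the solution when features are correlated.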

Findings:

Numerical experiments on standard datasets demonstrate the effectiveness and robustness of the proposed models in classification tasks even when using a reduced number of features. These experiments were evaluated using original performance indicators, highlighting the models’ ability to maintain high performance with fewer features.

Novelty:

The study discusses the trade-offs involved in combining different penalties to select the most relevant features while minimizing empirical risk. In particular, the integration of elastic-net and $\ell_1$-norm penalties within a unified framework, combined with the original Direct Feature Elimination approach, presents a novel method for improving both model accuracy and robustness.
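To make the trade-off between penalty and empirical risk concrete, here is a minimal sketch of embedded feature selection with an elastic-net penalty. It is not the authors' method: a plain logistic loss stands in for the robust probability machine, the solver is a generic proximal-gradient loop, and the final step simply keeps features with nonzero weight rather than applying the paper's Direct Feature Elimination, whose details are not given in this abstract. The function names, the parameters `lam1` and `lam2`, and the synthetic data are all hypothetical.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (elementwise shrinkage toward zero)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def fit_elastic_net_logistic(X, y, lam1=0.1, lam2=0.01, n_iter=500):
    """Minimize mean logistic loss + lam1*||w||_1 + (lam2/2)*||w||_2^2.

    Stand-in model (assumption): labels y are in {-1, +1}, no intercept.
    """
    n, d = X.shape
    # Step size 1/L, where L = ||X||_2^2 / (4n) bounds the curvature of the loss.
    step = 4.0 * n / np.linalg.norm(X, 2) ** 2
    w = np.zeros(d)
    for _ in range(n_iter):
        margins = y * (X @ w)
        # sigmoid(-margins), written with tanh to avoid overflow in exp.
        residual = 0.5 * (1.0 - np.tanh(0.5 * margins))
        grad = -(X.T @ (y * residual)) / n
        v = w - step * grad
        # Closed-form proximal step for the elastic-net penalty.
        w = soft_threshold(v, step * lam1) / (1.0 + step * lam2)
    return w

# Synthetic data (assumption): only features 0 and 1 carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = np.where(2.0 * X[:, 0] - 1.5 * X[:, 1] >= 0, 1.0, -1.0)

w = fit_elastic_net_logistic(X, y)
selected = np.flatnonzero(w)  # features kept by the penalty
print("weights:", np.round(w, 3))
print("selected features:", selected)
```

Raising `lam1` removes more features at the cost of a higher training loss, while `lam2` mainly shrinks the surviving weights; this is the kind of trade-off the paragraph above refers to.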



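Source journal: Pattern Recognition (Engineering & Technology - Engineering: Electrical & Electronic)
CiteScore: 14.40
Self-citation rate: 16.20%
Articles published: 683
Review time: 5.6 months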

About the journal: The field of Pattern Recognition is both mature and rapidly evolving, playing a crucial role in various related fields such as computer vision, image processing, text analysis, and neural networks. It closely intersects with machine learning and is being applied in emerging areas like biometrics, bioinformatics, multimedia data analysis, and data science. The journal Pattern Recognition, established half a century ago during the early days of computer science, has since grown significantly in scope and influence.