Embedded feature selection for robust probability learning machines
Miguel Carrasco, Benjamin Ivorra, Julio López, Angel M. Ramos
Pattern Recognition, Volume 159, Article 111157 (published 2024-11-12)
DOI: 10.1016/j.patcog.2024.111157
https://www.sciencedirect.com/science/article/pii/S0031320324009087
Citations: 0
Abstract
Methods:
Feature selection is essential for building effective machine learning models in binary classification. Eliminating unnecessary features can reduce the risk of overfitting and improve classification performance. Moreover, the data we handle typically contains a stochastic component, making it important to develop robust models that are insensitive to data perturbations. Although there are numerous methods and tools for feature selection, relatively few studies address embedded feature selection within robust classification models using penalization techniques.
Objective:
In this work, we introduce robust classifiers with integrated feature selection capabilities, utilizing probability machines based on different penalization techniques, such as the ℓ₁-norm or the elastic-net, combined with a novel Direct Feature Elimination process to improve model resilience and efficiency.
Findings:
Numerical experiments on standard datasets demonstrate the effectiveness and robustness of the proposed models in classification tasks even when using a reduced number of features. These experiments were evaluated using original performance indicators, highlighting the models’ ability to maintain high performance with fewer features.
Novelty:
The study discusses the trade-offs involved in combining different penalties to select the most relevant features while minimizing empirical risk. In particular, the integration of elastic-net and ℓ₁-norm penalties within a unified framework, combined with the original Direct Feature Elimination approach, presents a novel method for improving both model accuracy and robustness.
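Illustration (not part of the published abstract): for readers unfamiliar with the penalties named above, the following is a minimal Python sketch of the standard textbook definitions of the ℓ₁-norm and elastic-net penalties, together with the soft-thresholding step that is commonly used to explain why an ℓ₁ term can set weights exactly to zero. It is not the authors' probability machine formulation, nor their Direct Feature Elimination procedure; all function names, parameter names, and the example weight vector are hypothetical.

```python
# Generic sketch of sparsity-inducing penalties (assumed textbook forms,
# NOT the formulation used in the paper).

def l1_penalty(w, lam):
    """Return lam * ||w||_1."""
    return lam * sum(abs(wi) for wi in w)

def elastic_net_penalty(w, lam1, lam2):
    """Return lam1 * ||w||_1 + lam2 * ||w||_2^2."""
    return lam1 * sum(abs(wi) for wi in w) + lam2 * sum(wi * wi for wi in w)

def soft_threshold(w, t):
    """Proximal operator of t * ||.||_1.

    Each weight is shrunk toward zero by t; weights with magnitude
    at most t become exactly zero.
    """
    return [max(abs(wi) - t, 0.0) * (1 if wi > 0 else -1) for wi in w]

def selected_features(w, tol=1e-12):
    """Indices of features whose weight is non-zero (above tol in magnitude)."""
    return [j for j, wj in enumerate(w) if abs(wj) > tol]

# Hypothetical weight vector of a linear classifier with five features.
w = [0.8, -0.05, 0.0, -1.2, 0.03]
w_sparse = soft_threshold(w, 0.1)
print(selected_features(w_sparse))  # prints [0, 3]
```

In this toy example, the three weights with magnitude below the threshold are zeroed, so only features 0 and 3 remain. The squared ℓ₂ term in the elastic-net penalty shrinks weights without zeroing them, which is one way to read the trade-off between penalties that the abstract mentions.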
About the journal:
The field of Pattern Recognition is both mature and rapidly evolving, playing a crucial role in various related fields such as computer vision, image processing, text analysis, and neural networks. It closely intersects with machine learning and is being applied in emerging areas like biometrics, bioinformatics, multimedia data analysis, and data science. The journal Pattern Recognition, established half a century ago during the early days of computer science, has since grown significantly in scope and influence.