Principal Component Analysis With Fuzzy Elastic Net for Feature Selection
Authors: Yunlong Gao; Qinting Wu; Zhenghong Xu; Chao Cao; Jinyan Pan; Guifang Shao; Feiping Nie; Qingyuan Zhu
Journal: IEEE Transactions on Fuzzy Systems, vol. 32, no. 12, pp. 6878-6890 (published 2024-12-02; impact factor 10.7, JCR Q1, Computer Science, Artificial Intelligence)
DOI: 10.1109/TFUZZ.2024.3466926
URL: https://ieeexplore.ieee.org/document/10772384/
Feature selection serves as a fundamental technique in machine learning and data analysis, playing a crucial role in extracting valuable features from large-scale, high-dimensional datasets that may contain irrelevant features. To enhance the performance of feature selection, regularizers such as the $\ell_{1}$-norm or the $\ell_{2,1}$-norm are commonly used to encourage sparsity. Nonetheless, these traditional regularization techniques encounter certain challenges. When correlations exist among features, sparsity-driven regularization can unfairly shrink the weights of correlated features to zero, thus ignoring feature correlations and lacking group sparsity properties. While a straightforward combination of the $\ell_{1}$-norm and the $\ell_{2}$-norm can uncover feature correlations, it lacks adaptability and does not effectively balance sparsity and correlation. To address these challenges, we introduce a novel matrix-based regularization term, called a fuzzy elastic net, into the unsupervised feature selection model. Our model is founded on principal component analysis, a well-established dimensionality reduction technique adept at finding subspaces that retain most of the information in the raw data. The model is enhanced by a fuzzy elastic net, which promotes group or sparsity properties through adaptive parameter tuning. The new regularization term introduces a flexible fuzzy weighted scheme combining the $\ell_{2,2}$-norm and the $\ell_{2,p}$-norm ($0 < p \leq 1$). This approach allows adaptive adjustment based on data characteristics, offering a tunable balance between selecting discriminative features and identifying correlated ones. Consequently, this regularization term equips the model to handle diverse data analysis tasks flexibly, thereby enhancing adaptability and generalization performance. Furthermore, we propose an efficient optimization strategy to solve this model. Extensive experiments conducted on UCI datasets and real-world datasets demonstrate the effectiveness and efficiency of our proposed method.
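To make the norms named in the abstract concrete, the following is a minimal NumPy sketch, not the authors' model. The abstract does not give the exact form of the fuzzy elastic net, so the convex mix in `mixed_penalty` and its weight `alpha` are assumptions introduced purely for illustration, as are all function names. Plain PCA via SVD stands in for the paper's optimization strategy, and features are ranked by the row norms of the projection matrix, a common convention in sparse-projection feature selection.

```python
import numpy as np

def row_norms(W):
    """Euclidean norm of each row of W (one row per feature)."""
    return np.linalg.norm(W, axis=1)

def l2p_penalty(W, p=1.0):
    """Sum over rows of ||w_i||_2^p, i.e. the l_{2,p}-norm raised to the power p (0 < p <= 1)."""
    return float(np.sum(row_norms(W) ** p))

def frobenius_sq(W):
    """Squared l_{2,2} (Frobenius) norm: sum of all squared entries."""
    return float(np.sum(W ** 2))

def mixed_penalty(W, alpha, p=1.0):
    """Hypothetical convex mix of the two terms, alpha in [0, 1].

    ASSUMPTION: this is an illustrative stand-in, not the paper's
    fuzzy weighting scheme, which the abstract does not specify.
    """
    return alpha * l2p_penalty(W, p) + (1.0 - alpha) * frobenius_sq(W)

def pca_projection(X, k):
    """Top-k principal directions of X (samples in rows), from the SVD of the centered data."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:k].T  # shape: (n_features, k)

def rank_features(W):
    """Feature indices sorted by decreasing row norm of W."""
    return np.argsort(-row_norms(W))

# Synthetic data: feature 0 is given a much larger variance than the rest.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
X[:, 0] *= 10.0

W = pca_projection(X, k=1)
ranking = rank_features(W)
print("feature ranking:", ranking)
print("mixed penalty:", mixed_penalty(W, alpha=0.5, p=0.5))
```

In this sketch, `alpha` close to 1 emphasizes the row-sparse $\ell_{2,p}$ term, while `alpha` close to 0 emphasizes the Frobenius term, which shrinks all rows smoothly and tends to keep correlated features together. That is the trade-off the abstract describes, here controlled by a fixed weight rather than by an adaptive fuzzy scheme.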
About the journal:
The IEEE Transactions on Fuzzy Systems is a scholarly journal that focuses on the theory, design, and application of fuzzy systems. It aims to publish high-quality technical papers that contribute significant technical knowledge and exploratory developments in the field of fuzzy systems. The journal particularly emphasizes engineering systems and scientific applications. In addition to research articles, the Transactions also includes a letters section featuring current information, comments, and rebuttals related to published papers.