Robust Feature Selection by Removing Noise Entropy Within Mutual Information for Limited-Sample Industrial Data
Authors: Chan Xu; Silu Chen; Xiangjie Kong; Chi Zhang; Guilin Yang; Zaojun Fang
Journal: IEEE Transactions on Industrial Informatics, vol. 21, no. 5, pp. 3913-3923 (Journal Article)
Publication date: 2025-02-14
DOI: 10.1109/TII.2025.3534417
URL: https://ieeexplore.ieee.org/document/10887394/
Impact factor: 9.9; JCR: Q1 (Automation & Control Systems)
Feature selection is challenging for high-dimensional, small-sample data, particularly in industrial informatics, where noise comes from diverse sources. The mutual information between a label and noise-corrupted features includes the information entropy of the feature noise; removing this entropy can increase classification accuracy. In this article, we propose a robust feature selection method that eliminates feature noise from the relevance measure. Feature noise is modeled as a zero-mean censored normal distribution, so its entropy is determined by solving the variance equation based on the maximum entropy principle. Then, a noisy channel for feature transmission is proposed to extract the class-relevant noise component. Furthermore, a noise-free mutual information metric is developed by removing the noise entropy from the mutual information. Finally, a novel criterion is proposed that maximizes relevance based on noise-free mutual information while minimizing redundancy. Experimental results confirm the effectiveness of our approach on datasets from various industrial sectors.
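The general shape of a criterion that maximizes relevance while minimizing redundancy can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not give the censored-normal entropy formula or the noisy channel model, so neither is reproduced here. The sketch assumes a generic greedy loop over plug-in mutual information estimates on equal-width bins, and the names `discretize`, `mutual_information`, `select_features`, and the `noise_penalty` argument are hypothetical. `noise_penalty` is only a placeholder for a per-feature noise-entropy estimate that the caller would have to supply.

```python
import numpy as np


def discretize(x, bins=8):
    """Map a continuous feature to integer bin indices 0..bins-1 (equal-width bins)."""
    edges = np.histogram_bin_edges(x, bins=bins)
    # Use interior edges only so that the maximum value falls in the last bin.
    return np.digitize(x, edges[1:-1])


def mutual_information(a, b):
    """Plug-in mutual information (in nats) between two 1-D discrete arrays."""
    _, a_idx = np.unique(np.ravel(a), return_inverse=True)
    _, b_idx = np.unique(np.ravel(b), return_inverse=True)
    joint = np.zeros((a_idx.max() + 1, b_idx.max() + 1))
    np.add.at(joint, (a_idx, b_idx), 1.0)  # joint counts
    joint /= joint.sum()                   # joint probabilities
    pa = joint.sum(axis=1, keepdims=True)  # marginal of a, shape (na, 1)
    pb = joint.sum(axis=0, keepdims=True)  # marginal of b, shape (1, nb)
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log(joint[nz] / (pa @ pb)[nz])))


def select_features(X, y, k, noise_penalty=None, bins=8):
    """Greedy max-relevance / min-redundancy selection of k feature indices.

    noise_penalty is a hypothetical per-feature correction subtracted from the
    relevance term; it stands in for a noise-entropy estimate (assumption).
    """
    n_features = X.shape[1]
    Xd = np.column_stack([discretize(X[:, j], bins) for j in range(n_features)])
    penalty = (np.zeros(n_features) if noise_penalty is None
               else np.asarray(noise_penalty, dtype=float))
    relevance = np.array(
        [mutual_information(Xd[:, j], y) for j in range(n_features)]
    ) - penalty

    selected, remaining = [], list(range(n_features))
    while remaining and len(selected) < k:
        def score(j):
            if not selected:
                return relevance[j]
            # Redundancy: mean mutual information with already selected features.
            redundancy = np.mean(
                [mutual_information(Xd[:, j], Xd[:, s]) for s in selected]
            )
            return relevance[j] - redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

Calling `select_features(X, y, k)` with a samples-by-features array `X` and a discrete label array `y` returns the chosen column indices in selection order. With small samples the plug-in estimator is biased upward, which is one reason a correction term in the relevance measure can matter.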
About the journal:
The IEEE Transactions on Industrial Informatics is a multidisciplinary journal dedicated to publishing technical papers that connect theory with practical applications of informatics in industrial settings. It focuses on the utilization of information in intelligent, distributed, and agile industrial automation and control systems. The scope includes topics such as knowledge-based and AI-enhanced automation, intelligent computer control systems, flexible and collaborative manufacturing, industrial informatics in software-defined vehicles and robotics, computer vision, industrial cyber-physical and industrial IoT systems, real-time and networked embedded systems, security in industrial processes, industrial communications, systems interoperability, and human-machine interaction.