Pengfei Zhang, Yuxin Zhao, Lvhui Hu, Dexian Wang, Lilan Peng, Zhong Li, Herwig Unger, Tianrui Li
I2QD: Unsupervised feature selection via information quality, quantity, and difference degree
Information Processing & Management, Volume 62, Issue 5, Article 104173 (published 2025-04-09)
DOI: 10.1016/j.ipm.2025.104173
URL: https://www.sciencedirect.com/science/article/pii/S0306457325001141
Impact factor: 7.4; JCR: Q1 (Computer Science, Information Systems)
Citations: 0
Abstract
In the era of big data, datasets often contain a large number of features with great uncertainty and ambiguity, which makes it challenging to identify the features that are valuable for downstream tasks. Traditional unsupervised feature selection methods struggle to handle uncertain or fuzzy information because they often treat information quality and information quantity separately, which leads to suboptimal feature selection. To address this limitation, we propose a novel information representation system that integrates fuzzy relations with information source values, enabling a unified framework for quantifying both the quality and the quantity of information. Within this system, we introduce two key feature selection criteria: the information evaluation score (IES), which assesses the quality and quantity of information, and the difference degree (DD), which measures the difference between selected and unselected features. Based on these criteria, we develop an unsupervised feature selection algorithm that accounts for the Information Quantity, Quality, and Difference Degree of features (I2QD). The I2QD algorithm selects features by balancing information quality, quantity, and difference, even in the presence of uncertainty. Finally, experimental findings support the efficacy of the proposed I2QD algorithm, offering a promising solution for feature selection.
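The abstract does not give the formulas for IES or DD, so the following is only a rough sketch of the general idea (score each feature through a fuzzy relation, then prefer candidates that differ from those already selected), not the I2QD algorithm itself. Everything in it is an assumption made for illustration: the similarity relation `1 - |x_i - x_j|`, the use of a standard fuzzy information entropy as a stand-in for an information score, the mean absolute difference between relation matrices as a stand-in for a difference degree, and the hypothetical function names.

```python
import numpy as np

def fuzzy_relation(col):
    """Assumed fuzzy similarity relation: R[i, j] = 1 - |x_i - x_j| on a min-max scaled column."""
    col = np.asarray(col, dtype=float)
    span = col.max() - col.min()
    scaled = (col - col.min()) / span if span > 0 else np.zeros_like(col)
    return 1.0 - np.abs(scaled[:, None] - scaled[None, :])

def fuzzy_entropy(rel):
    """Stand-in information score: fuzzy information entropy of a relation matrix."""
    n = rel.shape[0]
    # Each row sum is the fuzzy cardinality of one sample's similarity class (always >= 1).
    return float(-np.mean(np.log2(rel.sum(axis=1) / n)))

def relation_difference(rel_a, rel_b):
    """Stand-in difference measure: mean absolute gap between two relation matrices."""
    return float(np.mean(np.abs(rel_a - rel_b)))

def greedy_select(X, k):
    """Greedy selection: start from the top-scoring feature, then repeatedly add the
    candidate maximizing score * (smallest difference to any selected feature)."""
    X = np.asarray(X, dtype=float)
    n_features = X.shape[1]
    relations = [fuzzy_relation(X[:, j]) for j in range(n_features)]
    scores = np.array([fuzzy_entropy(r) for r in relations])
    selected = [int(np.argmax(scores))]
    while len(selected) < min(k, n_features):
        best, best_val = None, -np.inf
        for j in range(n_features):
            if j in selected:
                continue
            diff = min(relation_difference(relations[j], relations[s]) for s in selected)
            val = scores[j] * diff
            if val > best_val:
                best, best_val = j, val
        selected.append(best)
    return selected

if __name__ == "__main__":
    # Columns 0 and 1 are identical; column 2 carries different structure.
    X = np.array([[0, 0, 0], [0, 0, 1], [3, 3, 2], [3, 3, 3]], dtype=float)
    print(greedy_select(X, 2))
```

In this toy input the second pick skips the duplicated column, because its difference to the already selected feature is zero. The multiplicative combination of score and difference is one arbitrary choice among many; the paper's actual balancing rule may differ.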
About the journal:
Information Processing and Management is dedicated to publishing cutting-edge original research at the convergence of computing and information science. Our scope encompasses theory, methods, and applications across various domains, including advertising, business, health, information science, information technology marketing, and social computing.
We aim to cater to the interests of both primary researchers and practitioners by offering an effective platform for the timely dissemination of advanced and topical issues in this interdisciplinary field. The journal places particular emphasis on original research articles, research survey articles, research method articles, and articles addressing critical applications of research. Join us in advancing knowledge and innovation at the intersection of computing and information science.