I2QD: Unsupervised feature selection via information quality, quantity, and difference degree

IF 7.4 | Zone 1 (Management) | JCR Q1 (Computer Science, Information Systems)
Pengfei Zhang, Yuxin Zhao, Lvhui Hu, Dexian Wang, Lilan Peng, Zhong Li, Herwig Unger, Tianrui Li
Journal: Information Processing & Management, vol. 62, no. 5, Article 104173
Published: 2025-04-09
DOI: 10.1016/j.ipm.2025.104173
URL: https://www.sciencedirect.com/science/article/pii/S0306457325001141
Publication type: Journal Article (not open access)
Citations: 0

Abstract

In the era of big data, datasets often contain a large number of features with great uncertainty and ambiguity, which makes it challenging to identify the features that are valuable for downstream tasks. Traditional unsupervised feature selection methods struggle to handle uncertain or fuzzy information effectively, because they often treat information quality and information quantity separately, which leads to suboptimal feature selection. To address this limitation, we propose a novel information representation system that integrates fuzzy relations with information source values, enabling a unified framework for quantifying both the quality and the quantity of information. Within this system, we introduce two key feature selection criteria: the information evaluation score (IES), which assesses the quality and quantity of information, and the difference degree (DD), which measures the difference between selected and unselected features. Based on these criteria, we develop an unsupervised feature selection algorithm that accounts for the Information Quantity, Quality, and Difference Degree of features (I2QD). The I2QD algorithm selects features by balancing information quality, quantity, and difference, even in the presence of uncertainty. Finally, experimental findings support the efficacy of the proposed I2QD algorithm, offering a promising solution for feature selection.
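The abstract does not give the formulas for IES or DD, so the following is only an illustrative sketch of the general idea (score each feature through a fuzzy relation, then greedily prefer features that differ from those already chosen). Everything in it is an assumption rather than the paper's method: the similarity relation `1 - |x_i - x_j|` on min-max scaled values, the fuzzy-entropy-style `stand_in_score`, the mean-absolute-difference `stand_in_difference`, and the product rule used in `select_features` are all hypothetical stand-ins.

```python
import math


def fuzzy_relation(column):
    """Assumed fuzzy similarity relation for one feature:
    R[i][j] = 1 - |x_i - x_j| on min-max scaled values."""
    lo, hi = min(column), max(column)
    span = (hi - lo) or 1.0  # avoid division by zero for constant features
    scaled = [(v - lo) / span for v in column]
    return [[1.0 - abs(a - b) for b in scaled] for a in scaled]


def stand_in_score(relation):
    """Hypothetical stand-in for IES (NOT the paper's definition):
    a fuzzy-entropy-style score, -(1/n) * sum_i log2(|[x_i]_R| / n)."""
    n = len(relation)
    return -sum(math.log2(sum(row) / n) for row in relation) / n


def stand_in_difference(rel_a, rel_b):
    """Hypothetical stand-in for DD (NOT the paper's definition):
    mean absolute difference between two fuzzy relations."""
    n = len(rel_a)
    total = sum(abs(a - b) for ra, rb in zip(rel_a, rel_b) for a, b in zip(ra, rb))
    return total / (n * n)


def select_features(data, k):
    """Greedy selection: start from the best-scoring feature, then repeatedly
    add the feature whose score times its smallest difference to the
    already-selected features is largest. Returns column indices in pick order."""
    columns = list(zip(*data))
    relations = [fuzzy_relation(c) for c in columns]
    scores = [stand_in_score(r) for r in relations]
    selected = [max(range(len(columns)), key=lambda j: scores[j])]
    while len(selected) < min(k, len(columns)):
        remaining = [j for j in range(len(columns)) if j not in selected]

        def gain(j):
            nearest = min(stand_in_difference(relations[j], relations[s]) for s in selected)
            return scores[j] * nearest

        selected.append(max(remaining, key=gain))
    return selected


if __name__ == "__main__":
    # Toy data: column 1 is a rescaled copy of column 0, column 2 is constant.
    toy = [
        [0.0, 0.0, 5.0, 1.0],
        [1.0, 2.0, 5.0, 0.0],
        [2.0, 4.0, 5.0, 1.0],
        [3.0, 6.0, 5.0, 0.0],
    ]
    print(select_features(toy, 2))
```

On the toy data, the constant column scores zero and the rescaled duplicate gains nothing once its twin is selected, which is the kind of balance between informativeness and difference that the abstract describes. The actual I2QD criteria should be taken from the paper itself.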
Source journal
Information Processing & Management
Category: Engineering and Technology, Computer Science: Information Systems
CiteScore: 17.00
Self-citation rate: 11.60%
Articles published: 276
Review time: 39 days
Journal description: Information Processing and Management is dedicated to publishing cutting-edge original research at the convergence of computing and information science. Its scope encompasses theory, methods, and applications across various domains, including advertising, business, health, information science, information technology marketing, and social computing. The journal aims to serve both primary researchers and practitioners by offering an effective platform for the timely dissemination of advanced and topical issues in this interdisciplinary field. It places particular emphasis on original research articles, research survey articles, research method articles, and articles addressing critical applications of research.