OSFS-Vague: Online streaming feature selection algorithm based on vague set

IF 8.4 | Zone 2 (Computer Science) | JCR Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Jie Yang, Zhijun Wang, Guoyin Wang, Yanmin Liu, Yi He, Di Wu
Journal: CAAI Transactions on Intelligence Technology, vol. 9, no. 6, pp. 1451-1466
Published: 2024-04-08
DOI: 10.1049/cit2.12327
URL: https://onlinelibrary.wiley.com/doi/10.1049/cit2.12327
PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.12327
Citations: 0

Abstract

Online streaming feature selection (OSFS), an online learning approach for handling streaming features, is critical for addressing high-dimensional data. In real big-data applications, the patterns and distributions of streaming features change constantly over time because the data are generated in dynamic environments. However, existing OSFS methods rely on preset, fixed hyperparameters, which leads to poor selection performance when the features are dynamic. To address these shortcomings, the authors propose a novel OSFS algorithm based on vague sets, named OSFS-Vague. Its main idea is to combine uncertainty theory and three-way decision theory, moving feature selection from the traditional dichotomous (two-way) method to a trichotomous (three-way) method. OSFS-Vague also improves how the correlation between features and labels is calculated. Moreover, OSFS-Vague uses the distance correlation coefficient to classify streaming features into relevant, weakly redundant, and redundant features. Finally, the relevant and weakly redundant features are filtered to obtain an optimal feature set. To evaluate OSFS-Vague, extensive experiments were conducted on 11 datasets. The results show that OSFS-Vague outperforms six state-of-the-art OSFS algorithms in selection accuracy and computational efficiency.
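
The abstract gives no formulas, so the following is only a minimal sketch of the two ingredients it names: a distance correlation coefficient and a three-way (trichotomous) split. The distance correlation follows the standard sample definition (double-centered pairwise distance matrices). The function names, the thresholds `low` and `high`, and the mapping from score to category are assumptions made for illustration; they are not taken from the paper and should not be read as the OSFS-Vague decision rule.

```python
import numpy as np


def _centered_distances(v):
    """Pairwise Euclidean distance matrix of the samples, double-centered."""
    v = np.asarray(v, dtype=float)
    v = v.reshape(v.shape[0], -1)  # treat a 1-D input as n samples of 1 feature
    d = np.sqrt(((v[:, None, :] - v[None, :, :]) ** 2).sum(axis=-1))
    return (d - d.mean(axis=0, keepdims=True)
              - d.mean(axis=1, keepdims=True) + d.mean())


def distance_correlation(x, y):
    """Sample distance correlation in [0, 1]; 0.0 if either input is constant."""
    a, b = _centered_distances(x), _centered_distances(y)
    dcov2 = max((a * b).mean(), 0.0)            # squared distance covariance
    denom = np.sqrt((a * a).mean() * (b * b).mean())
    return 0.0 if denom == 0 else float(np.sqrt(dcov2 / denom))


def three_way_split(redundancy, low=0.3, high=0.7):
    """Hypothetical three-way rule on a redundancy score.

    `low` and `high` are illustrative thresholds, not values from the paper.
    """
    if redundancy <= low:
        return "relevant"
    if redundancy >= high:
        return "redundant"
    return "weakly redundant"


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    selected = rng.normal(size=200)                       # an already selected feature
    candidate = selected + 0.5 * rng.normal(size=200)     # a newly arrived feature
    score = distance_correlation(candidate, selected)
    print(three_way_split(score))
```

In this sketch a newly arrived feature is scored against one already selected feature. A streaming selector would repeat the comparison against every feature in the current selected set and would also need a relevance check against the label, which the abstract mentions but does not specify.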


Source journal
CAAI Transactions on Intelligence Technology (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE)
CiteScore: 11.00
Self-citation rate: 3.90%
Articles published: 134
Review time: 35 weeks
About the journal: CAAI Transactions on Intelligence Technology is a leading venue for original research on the theoretical and experimental aspects of artificial intelligence technology. It is a fully open access journal co-published by the Institution of Engineering and Technology (IET) and the Chinese Association for Artificial Intelligence (CAAI), providing research that is openly accessible to read and share worldwide.