Open continual sampling with hypersphere knowledge transfer for rapid feature selection
Xuemei Cao, Xiangkun Wang, Haoyang Liang, Bingjun Wei, Xin Yang
Applied Soft Computing, vol. 170, Article 112664. Published 2025-02-01. Journal article.
DOI: 10.1016/j.asoc.2024.112664
https://www.sciencedirect.com/science/article/pii/S1568494624014388
Citations: 0
Abstract
Feature selection is a widely used data preprocessing technique, but it still faces two major challenges: (1) in open and dynamic environments, unknown classes may continually emerge in the data, and (2) the scale of the data keeps growing. To address these challenges, this paper proposes a novel Open Continual Sampling (OCS) method that combines the advantages of continual learning and three-way sampling, aiming to discover unknown knowledge and transfer known knowledge. OCS detects unknown classes by constructing a hypersphere knowledge base and sampling, from the unknown data, the most uncertain instances at each class decision boundary, which reduces redundant computation on samples. Based on OCS, we introduce a rapid feature selection framework (OCS-FS). Guided by the prior knowledge base, this framework rapidly calculates the importance of a small number of candidate features on representative samples and incrementally selects the optimal feature subset for the new data. Once learning for the new period is complete, the knowledge base is updated to reinforce old knowledge and integrate new knowledge. Extensive experiments on public benchmark datasets demonstrate that our method significantly outperforms existing state-of-the-art feature selection methods in both effectiveness and efficiency.
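The abstract does not give the construction of the hypersphere knowledge base or the sampling rule, so the following is only a minimal sketch of the general idea, not the authors' OCS algorithm. It assumes one hypersphere per known class (center = class mean, radius = largest distance from the center to a training point) and a hypothetical `margin` parameter that defines a band around each sphere surface. New instances are split three ways: well inside a sphere (known), near a surface (uncertain, worth sampling), or outside every sphere (candidate unknown class). The function names `build_hyperspheres` and `three_way_partition` are illustrative.

```python
import numpy as np


def build_hyperspheres(X, y, eps=1e-12):
    """Return {label: (center, radius)}, one hypersphere per known class.

    Assumption: center is the class mean and radius is the maximum
    distance from the center to any training point of that class.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    spheres = {}
    for label in np.unique(y):
        pts = X[y == label]
        center = pts.mean(axis=0)
        radius = np.linalg.norm(pts - center, axis=1).max()
        # eps guards against a zero radius for single-point classes.
        spheres[int(label)] = (center, max(float(radius), eps))
    return spheres


def three_way_partition(X_new, spheres, margin=0.2):
    """Split new instances into (known, boundary, unknown) index lists.

    margin is a hypothetical relative band width around each sphere
    surface; it is not a parameter taken from the paper.
    """
    X_new = np.asarray(X_new, dtype=float)
    known, boundary, unknown = [], [], []
    for i, x in enumerate(X_new):
        # Distance to center divided by radius, for the best-fitting sphere.
        ratio = min(np.linalg.norm(x - c) / r for c, r in spheres.values())
        if ratio <= 1 - margin:
            known.append(i)      # well inside a known class
        elif ratio <= 1 + margin:
            boundary.append(i)   # uncertain: close to a sphere surface
        else:
            unknown.append(i)    # outside every sphere
    return known, boundary, unknown
```

In this sketch only the `boundary` and `unknown` indices would be passed on to later processing, which illustrates how sampling near decision boundaries can avoid recomputing over instances that the stored knowledge already explains.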
About the journal:
Applied Soft Computing is an international journal promoting an integrated view of soft computing to solve real-life problems. The focus is to publish the highest quality research in application and convergence of the areas of Fuzzy Logic, Neural Networks, Evolutionary Computing, Rough Sets and other similar techniques to address real-world complexities.
Applied Soft Computing is a rolling publication: articles are published as soon as the editor-in-chief has accepted them. Therefore, the web site will continuously be updated with new articles and the publication time will be short.