Augmentation of degranulation mechanism for high-dimensional data with a multi-round optimization strategy

IF 3.2 · Region 1, Mathematics · JCR Q2: COMPUTER SCIENCE, THEORY & METHODS

Fuzzy Sets and Systems · Pub Date: 2024-04-04 · DOI: 10.1016/j.fss.2024.108969

Xiaoan Tang, Mingsong Duan, Kaijie Xu, Qiang Zhang


Citations: 0

Abstract

Various fuzzy clustering-based granulation–degranulation techniques have been developed for constructing and optimizing information granules, which help reveal the underlying structure of experimental data in Granular Computing (GrC). A well-performing granulation–degranulation mechanism runs with a low degranulation (reconstruction) error. However, the increasingly high-dimensional nature of data makes accurate reconstruction very challenging, so an important issue is how to reduce the reconstruction error so that high-dimensional data can be reconstructed more accurately. To address the unacceptably high reconstruction error caused by growing data dimensionality, and to improve the inefficient fuzzy clustering-based granulation in existing techniques, this study develops a multi-round iterative optimization strategy based on Fuzzy C-Means (FCM) to enhance reconstruction performance for high-dimensional data. First, we propose a Feature Sampling-based FCM (FS-FCM) scheme that serves as the granulation mechanism in the framework. The scheme draws on the idea of ensemble learning: the original high-dimensional data are granulated by generating and training low-dimensional sub-datasets through repeated random sampling of features. Then, a multi-round iterative granulation–degranulation mechanism is proposed along with its algorithmic framework. Within this framework, we attempt to reduce the reconstruction error by iteratively reconstructing the residual data generated in each round of granulation–degranulation. Finally, we validate the developed framework on twelve publicly available datasets of varying dimensionality. A set of ablation experiments verifies the effectiveness of the FS-FCM granulation scheme. The near-perfect reconstruction performance achieved by the proposed iterative framework on these datasets further demonstrates its superiority.
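The idea of reconstructing the residual round by round can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes textbook FCM and the commonly used membership-weighted degranulation formula, it omits the feature-sampling step of FS-FCM (whose details the abstract does not give), and all function names and parameters (`fcm`, `degranulate`, `multi_round`, `rounds`) are hypothetical.

```python
import numpy as np


def memberships(X, V, m=2.0):
    """Standard FCM memberships U (c x n) of data X (n x d) to prototypes V (c x d)."""
    dist = np.linalg.norm(X[None, :, :] - V[:, None, :], axis=2)  # (c, n)
    dist = np.fmax(dist, 1e-12)  # avoid division by zero at a prototype
    w = dist ** (-2.0 / (m - 1.0))
    return w / w.sum(axis=0, keepdims=True)  # each column sums to 1


def fcm(X, c, m=2.0, iters=100, seed=0):
    """Plain FCM (granulation): alternate membership and prototype updates."""
    rng = np.random.default_rng(seed)
    V = X[rng.choice(len(X), size=c, replace=False)]  # start from c data points
    for _ in range(iters):
        W = memberships(X, V, m) ** m
        V = (W @ X) / W.sum(axis=1, keepdims=True)
    return V


def degranulate(X, V, m=2.0):
    """Reconstruct each point as a u^m-weighted average of the prototypes."""
    W = memberships(X, V, m) ** m  # (c, n)
    return (W.T @ V) / W.sum(axis=0)[:, None]


def multi_round(X, c, rounds=5, m=2.0):
    """Assumed scheme: each round granulates and reconstructs the current residual."""
    X = np.asarray(X, dtype=float)
    residual = X.copy()
    approx = np.zeros_like(X)
    errors = []
    for r in range(rounds):
        V = fcm(residual, c, m, seed=r)
        approx += degranulate(residual, V, m)  # accumulate reconstructions
        residual = X - approx                  # what is still unexplained
        errors.append(float(np.sum(residual ** 2)))
    return approx, errors
```

Calling `approx, errors = multi_round(X, c=3, rounds=5)` on a numeric array `X` returns the accumulated reconstruction and the squared reconstruction error after each round, so the trend across rounds can be inspected. This sketch makes no guarantee that the error decreases monotonically.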


Source journal

Fuzzy Sets and Systems (Mathematics / Computer Science: Theory & Methods)

CiteScore: 6.50

Self-citation rate: 17.90%

Articles published: 321

Review time: 6.1 months

About the journal: Since its launch in 1978, the journal Fuzzy Sets and Systems has been devoted to the international advancement of the theory and application of fuzzy sets and systems. The theory of fuzzy sets now encompasses a well-organized corpus of basic notions including (but not restricted to) aggregation operations, a generalized theory of relations, specific measures of information content, and a calculus of fuzzy numbers. Fuzzy sets are also the cornerstone of a non-additive uncertainty theory, namely possibility theory, and of a versatile tool for both linguistic and numerical modeling: fuzzy rule-based systems. Numerous works now combine fuzzy concepts with other scientific disciplines as well as modern technologies. In mathematics, fuzzy sets have triggered new research topics in connection with category theory, topology, algebra, and analysis. Fuzzy sets are also part of a recent trend in the study of generalized measures and integrals, and are combined with statistical methods. Furthermore, fuzzy sets have strong logical underpinnings in the tradition of many-valued logics.