Augmentation of degranulation mechanism for high-dimensional data with a multi-round optimization strategy
Authors: Xiaoan Tang, Mingsong Duan, Kaijie Xu, Qiang Zhang
Journal: Fuzzy Sets and Systems (impact factor 3.2; JCR Q2, Computer Science, Theory & Methods)
Published: 2024-04-04
DOI: 10.1016/j.fss.2024.108969
URL: https://www.sciencedirect.com/science/article/pii/S0165011424001155
Various fuzzy clustering-based granulation–degranulation techniques have been developed for constructing and optimizing information granules, which help reveal the underlying structure of experimental data in Granular Computing (GrC). A well-performing granulation–degranulation mechanism is one that achieves a low degranulation (reconstruction) error. However, as data become increasingly high-dimensional, accurate reconstruction becomes much harder, and the central question is how to reduce the reconstruction error. To address the unacceptably high reconstruction error caused by growing data dimensionality, and to improve the inefficient fuzzy clustering-based granulation of existing techniques, this study develops a multi-round iterative optimization strategy based on Fuzzy C-Means (FCM) to enhance reconstruction performance for high-dimensional data. First, we propose a Feature Sampling-based FCM (FS-FCM) scheme that serves as the granulation mechanism in the framework. The scheme draws on the idea of ensemble learning: the original high-dimensional data are granulated by generating low-dimensional sub-datasets through repeated random feature sampling and training on them. Then, a multi-round iterative granulation–degranulation mechanism is proposed along with its algorithmic framework. Within this framework, we attempt to reduce the reconstruction error by iteratively reconstructing the residual data generated in each round of granulation–degranulation. Finally, we validate the framework on twelve publicly available datasets of varying dimensionality. A set of ablation experiments verifies the effectiveness of the FS-FCM granulation scheme. The near-perfect reconstruction performance achieved by the proposed iterative framework on the given datasets further demonstrates its superiority.
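The abstract gives no algorithmic detail, so the following is only a minimal sketch of the residual idea. It assumes standard FCM with the usual membership-weighted degranulation formula, where each point is rebuilt as sum_i u_ik^m v_i / sum_i u_ik^m. It deliberately omits the FS-FCM feature sampling, because the abstract does not say how results from the sampled sub-datasets are combined. All function names, parameters, and default values below are hypothetical and are not taken from the paper.

```python
import random


def memberships(data, protos, m):
    """FCM partition matrix: u[k][i] is the membership of point k in cluster i."""
    u = []
    for x in data:
        d2 = [sum((a - b) ** 2 for a, b in zip(x, v)) for v in protos]
        if min(d2) == 0.0:
            # The point coincides with a prototype: assign it crisply.
            hits = [1.0 if d == 0.0 else 0.0 for d in d2]
            u.append([h / sum(hits) for h in hits])
        else:
            # u_ik is proportional to (squared distance)^(-1/(m-1)).
            inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
            u.append([w / sum(inv) for w in inv])
    return u


def weighted_mean(weights, vectors, m):
    """Return sum_j w_j^m * vectors[j] / sum_j w_j^m, coordinate by coordinate."""
    wm = [w ** m for w in weights]
    total = sum(wm)
    dim = len(vectors[0])
    return [sum(w * v[j] for w, v in zip(wm, vectors)) / total for j in range(dim)]


def fcm_round(data, c, m, iters, rng):
    """One granulation-degranulation pass: fit FCM, then reconstruct the data."""
    protos = [list(p) for p in rng.sample(data, c)]  # init from random data points
    for _ in range(iters):
        u = memberships(data, protos, m)
        protos = [weighted_mean([row[i] for row in u], data, m) for i in range(c)]
    u = memberships(data, protos, m)
    # Degranulation: each point becomes a membership-weighted mix of prototypes.
    return [weighted_mean(row, protos, m) for row in u]


def multi_round(data, c=3, m=2.0, rounds=4, iters=30, seed=0):
    """Repeatedly granulate and degranulate the residual left by earlier rounds."""
    rng = random.Random(seed)
    residual = [list(x) for x in data]
    recon = [[0.0] * len(x) for x in data]
    errors = []
    for _ in range(rounds):
        approx = fcm_round(residual, c, m, iters, rng)
        recon = [[r + a for r, a in zip(rx, ax)] for rx, ax in zip(recon, approx)]
        residual = [[r - a for r, a in zip(rx, ax)] for rx, ax in zip(residual, approx)]
        # Squared reconstruction error of the accumulated reconstruction.
        errors.append(sum(v * v for row in residual for v in row))
    return recon, residual, errors


if __name__ == "__main__":
    gen = random.Random(1)
    toy = [[gen.random() for _ in range(6)] for _ in range(40)]  # synthetic data
    _, _, errs = multi_round(toy)
    print(errs)  # squared reconstruction error after each round
```

In this sketch the final reconstruction is the sum of the per-round reconstructions, so the original data always equal the accumulated reconstruction plus the last residual. Whether the error falls in every round depends on the data and settings, and the sketch does not enforce it.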
About the journal:
Since its launch in 1978, the journal Fuzzy Sets and Systems has been devoted to the international advancement of the theory and application of fuzzy sets and systems. The theory of fuzzy sets now encompasses a well-organized corpus of basic notions including (but not restricted to) aggregation operations, a generalized theory of relations, specific measures of information content, and a calculus of fuzzy numbers. Fuzzy sets are also the cornerstone of a non-additive uncertainty theory, namely possibility theory, and of a versatile tool for both linguistic and numerical modeling: fuzzy rule-based systems. Numerous works now combine fuzzy concepts with other scientific disciplines as well as modern technologies.
In mathematics, fuzzy sets have triggered new research topics in connection with category theory, topology, algebra, and analysis. Fuzzy sets are also part of a recent trend in the study of generalized measures and integrals, and are combined with statistical methods. Furthermore, fuzzy sets have strong logical underpinnings in the tradition of many-valued logics.