Omkaresh Kulkarni, Chitrakant Banchhor, V. Ravi Sankar

Fuzzy Sets and Systems, Volume 519, Article 109536 (published 2025-07-05). DOI: 10.1016/j.fss.2025.109536. Available at https://www.sciencedirect.com/science/article/pii/S0165011425002751
White shark beetle optimizer enabled deep fuzzy clustering for feature selection and big data clustering in MapReduce framework
Big data analytics has gained substantial attention over traditional data-processing methods because it excels at uncovering hidden patterns and correlations within massive datasets, commonly referred to as big data. Advances in information technology and the rapid expansion of the web have significantly increased the volume of data generated and used in everyday life, and traditional methods often struggle with efficiency and accuracy at this scale. Addressing these challenges is crucial, as big data is transforming domains from research to real-world applications in which accurate analysis is critical. This study introduces White Shark Beetle Optimizer + Deep Fuzzy Clustering (WSBO+DFC), a method designed to process and analyze big data efficiently. First, the input big data is retrieved from the database and passed to the MapReduce framework for processing. The MapReduce architecture has two phases, namely the mapper phase and the reducer phase. In the mapper phase, key-value pairs are generated from the dataset, giving structure to the previously unstructured data. This phase consists of multiple mappers, in which feature selection is performed using Support Vector Machine Recursive Feature Elimination (SVM-RFE); the proposed White Shark Beetle Optimizer (WSBO) is employed to optimize the weight parameters of the SVM. In the reducer phase, the features selected by the individual mappers are merged. The fused features are then subjected to big data clustering, which is conducted using Deep Fuzzy Clustering (DFC). The weight update process within DFC is guided by the WSBO, which is developed by integrating the White Shark Optimizer (WSO) and the Dung Beetle Optimizer (DBO). The developed method achieved a maximum accuracy of 89.87% and a maximum DB index of 0.995.
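The mapper/reducer flow described in the abstract can be sketched in miniature: mappers emit key-value pairs from raw records, and the reducer merges values by key. This is a generic illustration of the MapReduce pattern, not the paper's implementation; the record fields shown are hypothetical placeholders.

```python
from collections import defaultdict

def mapper(record):
    """Emit (key, value) pairs from one raw record, giving structure
    to otherwise unstructured input. Field names are hypothetical."""
    for key, value in record.items():
        yield key, value

def reducer(pairs):
    """Merge all mapper output by key, as in the reducer phase."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)

records = [{"f1": 0.2, "f2": 1.5}, {"f1": 0.9, "f2": 0.3}]
pairs = [kv for r in records for kv in mapper(r)]
merged = reducer(pairs)
# merged == {"f1": [0.2, 0.9], "f2": [1.5, 0.3]}
```

In the paper, each mapper additionally performs feature selection on its partition before the reducer fuses the results.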
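The per-mapper feature selection step is standard SVM-RFE: a linear-kernel SVM supplies feature weights, and recursive feature elimination repeatedly drops the weakest features. The sketch below uses scikit-learn on synthetic data standing in for one mapper's partition; it omits the paper's WSBO tuning of the SVM weight parameters.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.svm import SVC

# Toy data standing in for one mapper's partition of the big dataset.
X, y = make_classification(n_samples=200, n_features=10,
                           n_informative=4, random_state=0)

# SVM-RFE: the linear SVM's coefficients rank features; RFE
# recursively eliminates the lowest-ranked ones.
selector = RFE(SVC(kernel="linear"), n_features_to_select=4)
selector.fit(X, y)
selected = [i for i, keep in enumerate(selector.support_) if keep]
```

The reducer phase would then merge the `selected` index sets produced by all mappers.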
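Fuzzy clustering, unlike hard clustering, assigns each point a degree of membership in every cluster. The classical fuzzy c-means membership update below illustrates the idea behind the clustering stage; the paper's DFC adds a deep network and WSBO-guided weight updates that are not reproduced here.

```python
import numpy as np

def fuzzy_memberships(X, centers, m=2.0):
    """Fuzzy c-means membership update: U[i, j] is the degree to
    which point i belongs to cluster j, with each row summing to 1.
    m > 1 is the fuzzifier controlling how soft the assignment is."""
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    d = np.fmax(d, 1e-12)  # avoid division by zero at a center
    inv = d ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

X = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
centers = np.array([[0.5, 0.5], [5.0, 5.0]])
U = fuzzy_memberships(X, centers)
```

A point lying on a cluster center receives a membership near 1 for that cluster, while points between centers get graded memberships.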
Journal introduction:
Since its launch in 1978, the journal Fuzzy Sets and Systems has been devoted to the international advancement of the theory and application of fuzzy sets and systems. The theory of fuzzy sets now encompasses a well-organized corpus of basic notions, including (but not restricted to) aggregation operations, a generalized theory of relations, specific measures of information content, and a calculus of fuzzy numbers. Fuzzy sets are also the cornerstone of a non-additive uncertainty theory, namely possibility theory, and of a versatile tool for both linguistic and numerical modeling: fuzzy rule-based systems. Numerous works now combine fuzzy concepts with other scientific disciplines as well as modern technologies.
In mathematics, fuzzy sets have triggered new research topics in connection with category theory, topology, algebra, and analysis. Fuzzy sets are also part of a recent trend in the study of generalized measures and integrals, and are combined with statistical methods. Furthermore, fuzzy sets have strong logical underpinnings in the tradition of many-valued logics.