Q-GEV Based Novel Trainable Clustering Scheme for Reducing Complexity of Data Clustering

IF 2.3; Zone 4, Computer Science; JCR Q2, Computer Science, Artificial Intelligence

Expert Systems, Volume 42, Issue 4. Pub Date: 2025-02-27. DOI: 10.1111/exsy.70011

Mohamed Abd Elaziz, Esraa Osama Abo Zaid, Mohammed A. A. Al-qaness, Amjad Ali, Ali Kashif Bashir, Ahmed A. Ewees, Yasser D. Al-Otaibi, Ala Al-Fuqaha


Citations: 0

Abstract

This paper presents a new data clustering technique that aims to improve the performance of the trainable path-cost algorithm and to reduce the computational complexity of data clustering models. The proposed method supports the discovery of natural groupings and behaviours, which is crucial for effective coordination in complex environments. It identifies natural groupings within a set of features and detects the clusters whose members behave most similarly, addressing limitations of traditional state-of-the-art methods. The baseline algorithm uses a density peak clustering method to determine the cluster centres and then extracts features from the paths that pass through these peak points (centres). These features are used to train a support vector machine (SVM) that predicts the labels of the remaining points. The proposed algorithm enhances this pipeline in two ways. First, it employs the Q-Generalised Extreme Value (Q-GEV) distribution under power normalisation instead of traditional generalised extreme value distributions, which increases modelling flexibility. Second, it replaces the SVM with a random vector functional link (RVFL) network, which helps avoid overfitting and improves label prediction accuracy. The clustering algorithm is evaluated in experiments on UCI benchmark datasets and real-world data. The results show significant improvements in F1 measure, Jaccard index, purity, and accuracy, and highlight the method's ability to accurately identify paths between similar clusters. Its average F1 measure, Jaccard index, purity, and accuracy are 76.87%, 56.29%, 80.29%, and 79.64%, respectively.
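The abstract states that cluster centres come from a density peak clustering step. The sketch below illustrates the commonly used density-peak heuristic, in which each point is scored by its local density multiplied by its distance to the nearest denser point. It is not the authors' implementation: the function name `density_peak_centres`, the Gaussian density kernel, the cutoff `d_c`, and the toy data are all assumptions made for illustration, and the path-feature, Q-GEV, and RVFL stages are not covered.

```python
import numpy as np


def density_peak_centres(X, n_centres, d_c):
    """Pick n_centres points with the largest rho * delta (hypothetical helper)."""
    # Pairwise Euclidean distances, shape (n, n).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # Local density rho_i with a Gaussian kernel of width d_c (assumed choice);
    # subtracting 1 removes each point's contribution to its own density.
    rho = np.exp(-(dist / d_c) ** 2).sum(axis=1) - 1.0

    # delta_i: distance to the nearest point with strictly higher density.
    # The globally densest point has no such neighbour, so it receives its
    # largest distance to any other point instead.
    n = X.shape[0]
    delta = np.empty(n)
    for i in range(n):
        higher = rho > rho[i]
        delta[i] = dist[i, higher].min() if higher.any() else dist[i].max()

    # Centres are dense points that are also far from any denser point.
    gamma = rho * delta
    centres = np.argsort(gamma)[::-1][:n_centres]
    return centres, rho, delta


# Toy data (assumed): two well-separated 2-D Gaussian blobs of 50 points each.
rng = np.random.default_rng(0)
blob_a = rng.normal(loc=(0.0, 0.0), scale=0.3, size=(50, 2))
blob_b = rng.normal(loc=(5.0, 5.0), scale=0.3, size=(50, 2))
X = np.vstack([blob_a, blob_b])

centres, rho, delta = density_peak_centres(X, n_centres=2, d_c=0.5)
print("centre indices:", centres)
```

On this toy data one selected index falls in each blob, because the densest point of the second blob has no denser neighbour nearby and therefore receives a large `delta`. The pairwise distance matrix costs O(n²) memory, so this sketch suits only small datasets.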


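Source journal: Expert Systems (Engineering and Technology, Computer Science: Theory and Methods)

CiteScore: 7.40

Self-citation rate: 6.10%

Articles published: 266

Review time: 24 months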

About the journal: Expert Systems: The Journal of Knowledge Engineering publishes papers dealing with all aspects of knowledge engineering, including individual methods and techniques in knowledge acquisition and representation, and their application in the construction of systems, including expert systems, based thereon. Detailed scientific evaluation is an essential part of any paper. As well as traditional application areas, such as Software and Requirements Engineering, Human-Computer Interaction, and Artificial Intelligence, the journal is aiming at the new and growing markets for these technologies, such as Business, Economy, Market Research, and Medical and Health Care. The shift towards this new focus will be marked by a series of special issues covering hot and emergent topics.