AHDPC: Adaptively hyperbolic density peak clustering

IF 7.5 | Zone 1 (Computer Science) | Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

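Expert Systems with Applications, vol. 299, Article 130065. Pub Date: 2025-10-19. DOI: 10.1016/j.eswa.2025.130065. URL: https://www.sciencedirect.com/science/article/pii/S0957417425036814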

Jinglong Wang, Yu Zhang, Changju Liu, Jiangtao Xu


Citations: 0

Abstract



AHDPC: Adaptively hyperbolic density peak clustering

Non-uniformly distributed datasets are common in real-world applications, and density peak clustering (DPC) methods are widely used on them because of their strong clustering performance. However, existing DPC relies on Euclidean distance, which grows linearly; this produces misleading similarity between points from different clusters and limits further gains in accuracy. To overcome this limitation, this study introduces an adaptive hyperbolic density peak clustering algorithm (AHDPC) that extends DPC into hyperbolic space. First, linear Euclidean distance is replaced with exponentially growing hyperbolic distance to sharpen the density difference between points. Then, to reduce the misclassification of points at the junction of high-density and low-density regions, and the errors caused by extreme hyperbolic distances, a novel adaptive weighting strategy is proposed. It dynamically adjusts the hyperbolic distance using the trace of the global covariance matrix, the Euclidean norm, the maximum pairwise distance, and the point-to-center deviation. Finally, an adaptive cutoff-distance method based on a segmented search strategy is developed to eliminate manual tuning, and an exponential density function replaces the Gaussian kernel to improve computational efficiency. AHDPC thus not only overcomes the deficiencies of Euclidean space but also mitigates the restrictive aspects of hyperbolic space. Extensive experiments on synthetic and real datasets, the Olivetti faces dataset, and medical image datasets demonstrate that AHDPC outperforms state-of-the-art methods in clustering accuracy. The results also show that AHDPC produces a more discriminative decision graph for identifying cluster centers and assigns non-center points more accurately. The experiments further confirm its robustness and the contribution of the adaptive weight to clustering performance.
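The abstract does not state the paper's formulas, so the following is only a minimal sketch of the general idea: baseline DPC (local density plus distance to the nearest denser point) run on a hyperbolic distance matrix with an exponential density kernel. The Poincare-ball distance, the kernel form `exp(-d / d_c)`, the fixed cutoff `d_c`, the number of clusters `k`, and all function names are assumptions made for illustration. AHDPC's adaptive weighting and segmented cutoff search are not reproduced here.

```python
import numpy as np

def poincare_distance(X):
    """Pairwise Poincare-ball distances (assumed model); rows of X need norm < 1."""
    sq_norm = np.sum(X ** 2, axis=1)
    sq_diff = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    denom = (1.0 - sq_norm)[:, None] * (1.0 - sq_norm)[None, :]
    return np.arccosh(1.0 + 2.0 * sq_diff / denom)

def dpc_scores(D, d_c):
    """Baseline DPC quantities computed from a distance matrix D."""
    n = D.shape[0]
    # Exponential kernel (assumed form); subtract the self term exp(0) = 1.
    rho = np.exp(-D / d_c).sum(axis=1) - 1.0
    order = np.argsort(-rho, kind="stable")   # indices sorted by decreasing density
    delta = np.empty(n)
    nearest_higher = np.full(n, -1)
    delta[order[0]] = D.max()                 # convention for the densest point
    for rank in range(1, n):
        i = order[rank]
        higher = order[:rank]                 # points ranked denser than i
        j = higher[np.argmin(D[i, higher])]
        delta[i] = D[i, j]
        nearest_higher[i] = j
    return rho, delta, nearest_higher, order

def dpc_cluster(D, d_c, k):
    """Pick k centers by gamma = rho * delta, then propagate labels."""
    rho, delta, nearest_higher, order = dpc_scores(D, d_c)
    gamma = rho * delta
    gamma[order[0]] = np.inf                  # the densest point is always a center
    centers = np.argsort(-gamma, kind="stable")[:k]
    labels = np.full(D.shape[0], -1)
    labels[centers] = np.arange(k)
    for i in order:                           # denser points are labeled first
        if labels[i] == -1:
            labels[i] = labels[nearest_higher[i]]
    return labels
```

A call such as `dpc_cluster(poincare_distance(X), d_c=0.5, k=2)` returns one integer label per row of `X`. Plotting `rho` against `delta` from `dpc_scores` gives the decision graph that the abstract refers to.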


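Source journal

Expert Systems with Applications (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 13.80
Self-citation rate: 10.60%
Articles published: 2045
Review time: 8.7 months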

Journal description: Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans sectors such as finance, engineering, marketing, law, project management, information management, and medicine. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.