DCF: An Efficient and Robust Density-Based Clustering Method

Joshua Tobin, Mimi Zhang
Published in: 2021 IEEE International Conference on Data Mining (ICDM), December 2021
DOI: 10.1109/ICDM51629.2021.00074 (https://doi.org/10.1109/ICDM51629.2021.00074)
Citations: 6

Abstract

Density-based clustering methods have been shown to achieve promising results in modern data mining applications. A recent approach, Density Peaks Clustering (DPC), detects modes as points with high density and a large distance to any point of higher density, and hence often fails to detect low-density clusters in the data. Furthermore, DPC has quadratic complexity. We develop a new clustering algorithm that aims to improve the applicability and efficiency of the peak-finding technique. The improvements are threefold: (1) the new algorithm is applicable to large datasets; (2) it is capable of detecting clusters of varying density; (3) it can determine the correct number of clusters, even when the number of clusters is very high. The clustering performance of the algorithm is greatly enhanced by directing the peak-finding technique to discover modal sets rather than point modes. We present a theoretical analysis of our approach, together with experimental results that verify that the algorithm works well in practice. We also demonstrate a potential application of our work to unsupervised face recognition.
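The peak-finding criterion that the abstract attributes to DPC can be illustrated with a short sketch. This shows the baseline DPC quantities only, not the DCF algorithm itself, and it assumes a simple cutoff-count density estimate. The function name `dpc_scores`, the `cutoff` parameter, and the toy data are illustrative choices, not taken from the paper.

```python
import math


def dpc_scores(points, cutoff):
    """Return (rho, delta) per point for a basic DPC-style peak search.

    rho[i]   : number of other points closer than `cutoff` (assumed density estimate)
    delta[i] : distance to the nearest point of higher density
    """
    n = len(points)
    # All pairwise distances: this O(n^2) table is the quadratic cost.
    dist = [[math.dist(p, q) for q in points] for p in points]
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < cutoff)
           for i in range(n)]
    # Visit points from densest to sparsest; ties are broken by index.
    order = sorted(range(n), key=lambda i: (-rho[i], i))
    delta = [0.0] * n
    for rank, i in enumerate(order):
        if rank == 0:
            # Densest point has no denser neighbour: use its largest distance.
            delta[i] = max(dist[i])
        else:
            delta[i] = min(dist[i][j] for j in order[:rank])
    return rho, delta


# Toy data (hypothetical): one tight cluster and one sparse cluster.
dense = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05)]
sparse = [(5.0, 5.0), (5.8, 5.0), (5.0, 5.8)]
points = dense + sparse

rho, delta = dpc_scores(points, cutoff=1.0)
# Rank candidate modes by rho * delta, a common DPC decision score.
gamma = [r * d for r, d in zip(rho, delta)]
top_two = sorted(range(len(points)), key=lambda i: -gamma[i])[:2]
print(rho)
print(sorted(top_two))
```

In this toy example the sparse cluster's candidate mode receives a noticeably smaller score than the dense cluster's, because the score is scaled by absolute density. This is the kind of bias against low-density clusters that the abstract describes, and the pairwise distance table is where the quadratic cost arises.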