CURE-NS: a hierarchical clustering algorithm with new shrinking scheme

Y. Qian, Qingzhang Shi, Qi Wang
{"title":"CURE-NS: a hierarchical clustering algorithm with new shrinking scheme","authors":"Y. Qian, Qingzhang Shi, Qi Wang","doi":"10.1109/ICMLC.2002.1174512","DOIUrl":null,"url":null,"abstract":"CURE (clustering using representatives) is an efficient clustering algorithm for large databases, which is more robust to outliers compared with other clustering methods, and identifies clusters having non-spherical shapes and wide variances in size. CURE employs a fixed number or representative points to describe the cluster, and the set of representative points are first chosen randomly, and then are shrunk toward the mean of cluster. The shrinking operation plays a key role in CURE, which is used for weakening the effect of outliers. However, we found that the shrinking scheme of CURE is dependent on a hidden assumption of spherical shape of cluster, therefore CURE has difficulties in dealing with databases having specific shapes. In this paper, CURE-NS (CURE with new shrinking scheme) is proposed to overcome this problem, which uses the difference of density values of the representative points to determine the direction and distance of shrinking. Our shrinking scheme has nothing to do with the shape of cluster. A range of experiments demonstrate that CURE-NS has better clustering performance than CURE.","PeriodicalId":90702,"journal":{"name":"Proceedings. International Conference on Machine Learning and Cybernetics","volume":"12 1","pages":"895-899 vol.2"},"PeriodicalIF":0.0000,"publicationDate":"2002-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"15","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings. International Conference on Machine Learning and Cybernetics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICMLC.2002.1174512","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 15

Abstract

CURE (clustering using representatives) is an efficient clustering algorithm for large databases, which is more robust to outliers compared with other clustering methods, and identifies clusters having non-spherical shapes and wide variances in size. CURE employs a fixed number or representative points to describe the cluster, and the set of representative points are first chosen randomly, and then are shrunk toward the mean of cluster. The shrinking operation plays a key role in CURE, which is used for weakening the effect of outliers. However, we found that the shrinking scheme of CURE is dependent on a hidden assumption of spherical shape of cluster, therefore CURE has difficulties in dealing with databases having specific shapes. In this paper, CURE-NS (CURE with new shrinking scheme) is proposed to overcome this problem, which uses the difference of density values of the representative points to determine the direction and distance of shrinking. Our shrinking scheme has nothing to do with the shape of cluster. A range of experiments demonstrate that CURE-NS has better clustering performance than CURE.
CURE-NS:一种具有新收缩方案的分层聚类算法
CURE (clustering using representative)是一种高效的大型数据库聚类算法,与其他聚类方法相比,它对离群值的鲁棒性更强,能够识别形状非球形且大小差异较大的聚类。CURE采用固定数量或代表性点来描述聚类,首先随机选取代表性点集,然后向聚类均值缩小。缩小操作在CURE中起着关键作用,它用于减弱离群值的影响。然而,我们发现CURE的收缩方案依赖于一个隐藏的假设,即簇的球形,因此CURE在处理具有特定形状的数据库时存在困难。为了克服这一问题,本文提出了CURE- ns (CURE with new shrink scheme)算法,利用代表点的密度值差异来确定收缩的方向和距离。我们的收缩方案与星团的形状无关。一系列实验表明,CURE- ns具有比CURE更好的聚类性能。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信