Upper and Lower Bounds for Complete Linkage in General Metric Spaces

IF 1.3 · Zone 4, Physics and Astrophysics · JCR Q4, PHYSICS, APPLIED
Anna Arutyunova, A. Großwendt, Heiko Röglin, Melanie Schmidt, Julian Wargalla
DOI: 10.4230/LIPIcs.APPROX/RANDOM.2021.18 (https://doi.org/10.4230/LIPIcs.APPROX/RANDOM.2021.18)
Journal: Spin, vol. 27(1), pp. 489-518. Published 2023-11-30. Journal Article.
Citations: 1

Abstract

In a hierarchical clustering problem the task is to compute a series of mutually compatible clusterings of a finite metric space $(P, \mathrm{dist})$. Starting with the clustering where every point forms its own cluster, one iteratively merges two clusters until only one cluster remains. Complete linkage is a well-known and popular algorithm to compute such clusterings: in every step it merges the two clusters whose union has the smallest radius (or diameter) among all currently possible merges. We prove that the radius (or diameter) of every $k$-clustering computed by complete linkage is at most a factor of $O(k)$ (or $O(k^{\ln(3)/\ln(2)}) = O(k^{1.59})$) worse than that of an optimal $k$-clustering minimizing the radius (or diameter). Furthermore, we give a negative answer to the question posed by Dasgupta and Long (J Comput Syst Sci 70(4):555–569, 2005. https://doi.org/10.1016/j.jcss.2004.10.006), who show a lower bound of $\Omega(\log(k))$ and ask whether the approximation guarantee is in fact $\Theta(\log(k))$. We present instances where complete linkage performs poorly in the sense that the $k$-clustering computed by complete linkage is off by a factor of $\Omega(k)$ from an optimal solution for radius and diameter. We conclude that in general metric spaces complete linkage does not perform asymptotically better than single linkage, which merges the two clusters with the smallest inter-cluster distance and for which we prove an approximation guarantee of $O(k)$.
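The merge rule described in the abstract can be illustrated with a minimal sketch. It covers only the diameter variant and assumes the usual definition of a cluster's diameter as its largest pairwise distance. The function names `diameter` and `complete_linkage`, the distance-matrix input format, and the four-point example are assumptions made for illustration and are not taken from the paper. The search over pairs is deliberately naive and not tuned for large inputs.

```python
from itertools import combinations


def diameter(cluster, dist):
    """Largest pairwise distance inside a cluster (0 for a single point)."""
    return max((dist[a][b] for a, b in combinations(cluster, 2)), default=0.0)


def complete_linkage(dist, k):
    """Greedily merge clusters until k remain.

    dist is a symmetric matrix of pairwise distances (assumed input format).
    Returns a list of frozensets of point indices.
    """
    clusters = [frozenset([i]) for i in range(len(dist))]
    while len(clusters) > k:
        # Choose the pair of clusters whose union has the smallest diameter.
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: diameter(pair[0] | pair[1], dist),
        )
        # Replace the two chosen clusters by their union.
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return clusters


# Hypothetical example: four points on a line at positions 0, 1, 3, 7.
positions = [0, 1, 3, 7]
dist = [[abs(p - q) for q in positions] for p in positions]
result = complete_linkage(dist, 2)
largest_diameter = max(diameter(c, dist) for c in result)
```

On this small instance the greedy merges happen to reach a 2-clustering with the best possible largest diameter. The lower-bound instances mentioned in the abstract are those where greedy merging ends up a factor of $\Omega(k)$ away from the optimum.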
2012 ACM Subject Classification: Theory of computation → Facility location and clustering