Universal Algorithms for Clustering Problems

Impact factor 1.4 · Zone 3 (Computer Science) · JCR Q3 (COMPUTER SCIENCE, THEORY & METHODS)

ACM Transactions on Algorithms Pub Date : 2021-05-05 DOI:10.1145/3572840

Arun Ganesh, B. Maggs, Debmalya Panigrahi

Volume 19 (1), pages 1-46

Citations: 2

Abstract


Universal Algorithms for Clustering Problems

This article presents universal algorithms for clustering problems, including the widely studied k-median, k-means, and k-center objectives. The input is a metric space containing all potential client locations. The algorithm must select k cluster centers that form a good solution for whichever subset of clients is actually realized. Specifically, we aim for low regret, defined as the maximum, over all subsets of clients, of the difference between the cost of the algorithm’s solution and that of an optimal solution. A universal algorithm’s solution sol for a clustering problem is said to be an (α, β)-approximation if, for all subsets of clients C′, it satisfies sol(C′) ≤ α · opt(C′) + β · mr, where opt(C′) is the cost of the optimal solution for clients C′ and mr is the minimum regret achievable by any solution. Our main results are universal algorithms for the standard clustering objectives of k-median, k-means, and k-center that achieve (O(1), O(1))-approximations. These results are obtained via a novel framework for universal algorithms using linear programming (LP) relaxations. They generalize to other ℓ_p-objectives and to the setting where some subset of the clients is fixed. We also give hardness results showing that (α, β)-approximation is NP-hard if α or β is at most a certain constant, even for the widely studied special case of Euclidean metric spaces. This shows that, in some sense, (O(1), O(1))-approximation is the strongest type of guarantee obtainable for universal clustering.
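The definitions of regret and (α, β)-approximation in the abstract can be made concrete with a small brute-force sketch. It is not the LP-based algorithm the article describes; it simply enumerates every client subset and every candidate solution, so it is exponential and only usable on toy instances. The sketch assumes the k-median objective, that centers are chosen from the same point set as the clients, and that only non-empty client subsets are considered. All function names are hypothetical.

```python
from itertools import combinations


def cost(centers, clients, dist):
    """k-median cost: every client pays the distance to its nearest center."""
    return sum(min(dist(c, s) for s in centers) for c in clients)


def nonempty_subsets(points):
    """Yield every non-empty subset of points (exponentially many)."""
    for r in range(1, len(points) + 1):
        yield from combinations(points, r)


def opt(clients, points, k, dist):
    """Cost of the best k centers for one fixed client subset."""
    return min(cost(S, clients, dist) for S in combinations(points, k))


def regret(centers, points, k, dist):
    """Worst-case gap between this solution and the per-subset optimum."""
    return max(
        cost(centers, C, dist) - opt(C, points, k, dist)
        for C in nonempty_subsets(points)
    )


def minimum_regret(points, k, dist):
    """mr: the smallest regret achievable by any set of k centers."""
    return min(regret(S, points, k, dist) for S in combinations(points, k))


def is_approximation(centers, points, k, dist, alpha, beta):
    """Check sol(C') <= alpha * opt(C') + beta * mr for every subset C'."""
    mr = minimum_regret(points, k, dist)
    return all(
        cost(centers, C, dist) <= alpha * opt(C, points, k, dist) + beta * mr
        for C in nonempty_subsets(points)
    )


# Toy instance (an assumption for illustration): four points on a line, k = 1.
points = [0, 1, 2, 10]
line_dist = lambda a, b: abs(a - b)

print(minimum_regret(points, 1, line_dist))                # 8
print(regret((10,), points, 1, line_dist))                 # 25
print(is_approximation((2,), points, 1, line_dist, 1, 1))  # True
```

On this instance the center at position 2 attains the minimum regret of 8 (the worst case is the single client at 10), so it is trivially a (1, 1)-approximation. The center at 10 has regret 25, driven by the subset {0, 1, 2}.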


Source journal

ACM Transactions on Algorithms (COMPUTER SCIENCE, THEORY & METHODS; MATHEMATICS, APPLIED)

CiteScore

3.30

Self-citation rate

0.00%

Articles published

Review time

6-12 weeks

About the journal: ACM Transactions on Algorithms welcomes submissions of original research of the highest quality dealing with algorithms that are inherently discrete and finite, and having mathematical content in a natural way, either in the objective or in the analysis. Most welcome are new algorithms and data structures, new and improved analyses, and complexity results. Specific areas of computation covered by the journal include combinatorial searches and objects; counting; discrete optimization and approximation; randomization and quantum computation; parallel and distributed computation; algorithms for graphs, geometry, arithmetic, number theory, strings; on-line analysis; cryptography; coding; data compression; learning algorithms; methods of algorithmic analysis; discrete algorithms for application areas such as biology, economics, game theory, communication, computer systems and architecture, hardware design, scientific computing