Akanksha Agrawal, Tanmay Inamdar, Saket Saurabh, Jie Xue
{"title":"Clustering what Matters: Optimal Approximation for Clustering with Outliers","authors":"Akanksha Agrawal, Tanmay Inamdar, Saket Saurabh, Jie Xue","doi":"10.1613/jair.1.14883","DOIUrl":null,"url":null,"abstract":"Clustering with outliers is one of the most fundamental problems in Computer Science. Given a set X of n points and two numbers k, m, the clustering with outliers aims to exclude m points from X and partition the remaining points into k clusters that minimizes a certain cost function. In this paper, we give a general approach for solving clustering with outliers, which results in a fixed-parameter tractable (FPT) algorithm in k and m—i.e., an algorithm with running time of the form f(k, m) · nO(1) for some function f—that almost matches the approximation ratio for its outlier-free counterpart. As a corollary, we obtain FPT approximation algorithms with optimal approximation ratios for k-Median and k-Means with outliers in general and Euclidean metrics. We also exhibit more applications of our approach to other variants of the problem that impose additional constraints on the clustering, such as fairness or matroid constraints.","PeriodicalId":54877,"journal":{"name":"Journal of Artificial Intelligence Research","volume":null,"pages":null},"PeriodicalIF":4.5000,"publicationDate":"2023-09-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Artificial Intelligence Research","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1613/jair.1.14883","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0
Abstract
Clustering with outliers is one of the most fundamental problems in Computer Science. Given a set X of n points and two numbers k, m, the clustering with outliers aims to exclude m points from X and partition the remaining points into k clusters that minimizes a certain cost function. In this paper, we give a general approach for solving clustering with outliers, which results in a fixed-parameter tractable (FPT) algorithm in k and m—i.e., an algorithm with running time of the form f(k, m) · nO(1) for some function f—that almost matches the approximation ratio for its outlier-free counterpart. As a corollary, we obtain FPT approximation algorithms with optimal approximation ratios for k-Median and k-Means with outliers in general and Euclidean metrics. We also exhibit more applications of our approach to other variants of the problem that impose additional constraints on the clustering, such as fairness or matroid constraints.
期刊介绍:
JAIR(ISSN 1076 - 9757) covers all areas of artificial intelligence (AI), publishing refereed research articles, survey articles, and technical notes. Established in 1993 as one of the first electronic scientific journals, JAIR is indexed by INSPEC, Science Citation Index, and MathSciNet. JAIR reviews papers within approximately three months of submission and publishes accepted articles on the internet immediately upon receiving the final versions. JAIR articles are published for free distribution on the internet by the AI Access Foundation, and for purchase in bound volumes by AAAI Press.