Number-On-Forehead Communication Complexity of Data Clustering with Sunflowers
Fabricio Mendoza-Granada, Marcos Villagra
Anais do Encontro de Teoria da Computação (ETC), published 2019-07-02
DOI: 10.5753/etc.2019.6394 (https://doi.org/10.5753/etc.2019.6394)
Abstract
We study the problem of performing data clustering in a distributed setting, a problem that may arise in many practical areas such as machine learning and data analysis. The way in which the sites communicate and the way data is allocated define a model of communication. We develop a protocol to compute distributed clustering in the Number-On-Forehead model of communication complexity. In our model, we require that each site is aware of all clusters in its own data and that all data allocated among sites define a sunflower. We show that there exists a two-round communication protocol for data clustering where each site knows an to all clusters.
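To make the two structural notions in the abstract concrete, here is a minimal Python sketch. It uses the standard combinatorial definition of a sunflower (every pair of sets intersects in exactly the same core) and the standard Number-On-Forehead visibility rule (each site sees every input except its own). The function names `sunflower_core` and `nof_views` are hypothetical, and the paper's own model may differ in detail from these textbook definitions; this is an illustration, not the authors' protocol.

```python
from itertools import combinations


def sunflower_core(sets):
    """Return the common core if `sets` forms a sunflower, else None.

    Textbook definition: a family is a sunflower when every pair of
    distinct sets has the same intersection (the core).
    Convention assumed here: families with fewer than two sets return None.
    """
    family = [frozenset(s) for s in sets]
    if len(family) < 2:
        return None
    # If the family is a sunflower, the core equals the intersection of all sets.
    core = frozenset.intersection(*family)
    for a, b in combinations(family, 2):
        if a & b != core:
            return None
    return core


def nof_views(inputs):
    """Standard NOF visibility: site i sees all inputs except the i-th one."""
    return [
        [x for j, x in enumerate(inputs) if j != i]
        for i in range(len(inputs))
    ]


if __name__ == "__main__":
    # Hypothetical data: three sites sharing the points {1, 2}.
    sites = [{1, 2, 3}, {1, 2, 4}, {1, 2, 5}]
    print(sunflower_core(sites))
    print(nof_views(["x0", "x1", "x2"]))
```

In the example data, the shared points `{1, 2}` play the role of the core and the remaining points of each site are its petal; a family such as `{1, 2}, {2, 3}, {1, 3}` is rejected because its pairwise intersections differ.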