Can Distributed Uniformity Testing Be Local?

Proceedings of the 2019 ACM Symposium on Principles of Distributed Computing. Pub Date: 2019-07-16. DOI: 10.1145/3293611.3331613

Uri Meir, Dor Minzer, R. Oshman


Citations: 3

Abstract

In the distributed uniformity testing problem, k servers draw samples from some unknown distribution, and the goal is to determine whether the unknown distribution is uniform or whether it is ε-far from uniform, where ε is a proximity parameter. Each server decides whether to accept or reject, and these decisions are sent to a referee, who makes a final decision based on the servers' local decisions. Uniformity testing is a particularly useful building block, because it is complete for the problem of testing identity to any fixed distribution. It was recently shown that distributing the task of uniformity testing allows each server to draw fewer samples than are needed in the centralized case, but so far the number of samples required for distributed uniformity testing has not been well understood. In this paper we settle this question, and also investigate the cost of using local decision rules, such as rejecting if and only if at least one server wants to reject (the usual decision rule used in local distributed decision). To answer these questions, we develop a new Fourier-based technique for proving lower bounds on the sample complexity of distribution testing, which lends itself particularly well to the distributed case. Using our technique, we tightly characterize the number of samples required for uniformity testing when the referee can apply any decision function to the servers' local decisions. We also show that if the network rejects whenever one server wants to reject, then the cost of uniformity testing is much higher, and in fact we do not gain compared to the centralized case unless the number of servers is exponential in Ω(1/ε). Finally, we apply our lower bound technique to the case where the referee applies a threshold decision rule, and also generalize a lower bound from [1] for learning an unknown input distribution.
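As a rough illustration of the model described in the abstract (and not the authors' protocol or analysis), the following Python sketch simulates k servers that each draw q samples, make a local accept/reject decision, and forward it to a referee. Everything specific here is an assumption made for illustration: the function names (local_decision, and_rule, threshold_rule, run_protocol), the use of a pairwise collision count as the local statistic, and the acceptance threshold are all hypothetical choices, and the parameters are not tuned to any bound from the paper.

```python
import random
from itertools import combinations
from typing import Callable, Sequence

Referee = Callable[[Sequence[bool]], bool]   # maps local decisions to a final one
Sampler = Callable[[random.Random], int]     # draws one element of {0, ..., n-1}


def local_decision(samples: Sequence[int], n: int, eps: float) -> bool:
    """Return True (accept) if the sample looks uniform over a domain of size n."""
    q = len(samples)
    pairs = q * (q - 1) // 2
    collisions = sum(1 for a, b in combinations(samples, 2) if a == b)
    # The uniform distribution has pairwise collision probability exactly 1/n;
    # every other distribution has a larger one. The eps**2 / 2 slack is an
    # illustrative assumption, not a tuned or proven threshold.
    return collisions <= pairs * (1 + eps ** 2 / 2) / n


def and_rule(decisions: Sequence[bool]) -> bool:
    """Local decision rule: reject iff at least one server rejects."""
    return all(decisions)


def threshold_rule(t: int) -> Referee:
    """Threshold rule: reject iff at least t servers reject."""
    return lambda decisions: sum(1 for d in decisions if not d) < t


def run_protocol(sampler: Sampler, k: int, q: int, n: int, eps: float,
                 referee: Referee, rng: random.Random) -> bool:
    """Each of k servers draws q samples and decides; the referee aggregates."""
    decisions = [
        local_decision([sampler(rng) for _ in range(q)], n, eps)
        for _ in range(k)
    ]
    return referee(decisions)


if __name__ == "__main__":
    rng = random.Random(0)
    n, k, q, eps = 100, 20, 30, 0.5
    uniform = lambda r: r.randrange(n)
    # Uniform on half of the domain, which is far from uniform on all of it.
    skewed = lambda r: r.randrange(n // 2)
    for name, sampler in [("uniform", uniform), ("skewed", skewed)]:
        for rule_name, rule in [("AND", and_rule), ("threshold", threshold_rule(k // 2))]:
            accepted = run_protocol(sampler, k, q, n, eps, rule, rng)
            print(name, rule_name, "accept" if accepted else "reject")
```

The sketch makes one qualitative point from the abstract easy to see: under the AND rule a single server's false rejection is enough to reject a uniform input, so each local test has to be far more conservative than under a threshold rule, where the referee can tolerate some erroneous local rejections.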


