Robust Max Selection
Trung Dang, Zhiyi Huang
arXiv:2409.06014 · arXiv - CS - Data Structures and Algorithms · 2024-09-09
Citations: 0
Abstract
We introduce a new model for studying algorithm design under unreliable information, and apply it to the problem of finding the uncorrupted maximum element of a list of $n$ elements, $k$ of which are corrupted. Under our model, algorithms can perform black-box comparison queries between any pair of elements; however, queries involving corrupted elements may return arbitrary answers. In particular, corrupted elements need not behave as any consistent values, and may introduce cycles into the elements' ordering. This poses new challenges for designing correct algorithms in this setting. For example, an algorithm cannot simply output a single element: in a list containing one corrupted and one uncorrupted element, the two are indistinguishable. To guarantee correctness, an algorithm in this setting must instead output a set that is certain to contain the uncorrupted maximum.

We first show that any correct algorithm must output a set of size at least $\min\{n, 2k + 1\}$ to guarantee that the uncorrupted maximum is contained in the output set. Restricted to algorithms whose output size is exactly $\min\{n, 2k + 1\}$, we show matching upper and lower bounds of $\Theta(nk)$ comparison queries for deterministic algorithms. On the randomized side, we propose a two-stage algorithm that, with high probability, uses $O(n + k \operatorname{polylog} k)$ comparison queries to find such a set, nearly matching the $\Omega(n)$ queries necessary for any randomized algorithm to be correct with constant probability.
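The elimination principle underlying the output-set guarantee can be illustrated with a deliberately query-inefficient baseline (our own sketch, not the paper's algorithm): compare every pair once and output every element with at most $k$ losses. If an uncorrupted element loses $k+1$ or more comparisons, at most $k$ of those opponents can be corrupted, so it genuinely lost to some uncorrupted element and cannot be the maximum; conversely, the uncorrupted maximum can only "lose" to the at most $k$ corrupted elements. The demo adversary below (values, corrupted index, and cheating strategy all hypothetical) makes the corrupted element claim every win:

```python
def robust_max_candidates(n, k, cmp):
    """Return indices guaranteed to contain the uncorrupted maximum.

    cmp(i, j) -> True iff element i beats element j; answers involving
    corrupted elements may be arbitrary and mutually inconsistent.
    Uses all n*(n-1)/2 comparisons, so this is a correctness baseline
    only, far above the paper's query bounds.
    """
    losses = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if cmp(i, j):
                losses[j] += 1
            else:
                losses[i] += 1
    # An element with k+1 or more losses lost to at least one uncorrupted
    # opponent (at most k opponents are corrupted), so it cannot be the
    # uncorrupted maximum and can be safely discarded.
    return [i for i in range(n) if losses[i] <= k]


# Hypothetical demo: index 2 is corrupted and claims to win every comparison.
VALUES = [3, 1, 4, 2, 5, 9]
CORRUPTED = {2}

def cmp(i, j):
    if i in CORRUPTED:
        return True   # arbitrary answer (here: adversarially always "wins")
    if j in CORRUPTED:
        return False
    return VALUES[i] > VALUES[j]

print(robust_max_candidates(6, 1, cmp))  # -> [2, 5]; index 5 is the true max
```

Note that this baseline also respects the $\min\{n, 2k+1\}$ size bound: among any $m$ surviving elements, their $\binom{m}{2}$ mutual comparisons force some element to absorb at least $(m-1)/2$ losses, so $(m-1)/2 \le k$, i.e. $m \le 2k + 1$.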