Optimal mean-based algorithms for trace reconstruction

Anindya De, R. O'Donnell, R. Servedio
Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing
Publication date: 2016-12-09
DOI: 10.1145/3055399.3055450 (https://doi.org/10.1145/3055399.3055450)
Citations: 58

Abstract

In the (deletion-channel) trace reconstruction problem, there is an unknown $n$-bit source string $x$. An algorithm is given access to independent traces of $x$, where a trace is formed by deleting each bit of $x$ independently with probability $\delta$. The goal of the algorithm is to recover $x$ exactly (with high probability), while minimizing samples (number of traces) and running time. Previously, the best known algorithm for the trace reconstruction problem was due to Holenstein et al. [SODA 2008]; it uses $\exp(\tilde{O}(n^{1/2}))$ samples and running time for any fixed $0 < \delta < 1$. It is also what we call a "mean-based algorithm", meaning that it only uses the empirical means of the individual bits of the traces. Holenstein et al. also gave a lower bound, showing that any mean-based algorithm must use at least $n^{\tilde{\Omega}(\log n)}$ samples. In this paper we improve both of these results, obtaining matching upper and lower bounds for mean-based trace reconstruction. For any constant deletion rate $0 < \delta < 1$, we give a mean-based algorithm that uses $\exp(O(n^{1/3}))$ time and traces; we also prove that any mean-based algorithm must use at least $\exp(\Omega(n^{1/3}))$ traces. In fact, we obtain matching upper and lower bounds even for $\delta$ subconstant and $\rho := 1 - \delta$ subconstant: when $(\log^3 n)/n \ll \delta \le 1/2$ the bound is $\exp(-\Theta(\delta n)^{1/3})$, and when $1/\sqrt{n} \ll \rho \le 1/2$ the bound is $\exp(-\Theta(n/\rho)^{1/3})$. Our proofs involve estimates for the maxima of Littlewood polynomials on complex disks. We show that these techniques can also be used to perform trace reconstruction with random insertions and bit-flips in addition to deletions. We also find a surprising result: for deletion probabilities $\delta > 1/2$, the presence of insertions can actually help with trace reconstruction.
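
To make the setting concrete, here is a minimal Python sketch of the deletion channel and of the statistic a mean-based algorithm is allowed to see. It is an illustration under stated assumptions, not the paper's algorithm: the function names (`sample_trace`, `empirical_means`, `expected_means`) are hypothetical, positions are 0-indexed, and each trace is treated as padded with zeros up to length $n$. The closed form in `expected_means` is the standard calculation that source bit $x_i$ lands at trace position $j$ when it survives and exactly $j$ of the $i$ bits before it survive, so $\mathbb{E}[y_j] = \rho \sum_{i \ge j} \binom{i}{j} \rho^{j} \delta^{i-j} x_i$ with $\rho = 1 - \delta$. Summing against $w^j$ gives $\rho \sum_i x_i (\delta + \rho w)^i$, which is one way to see why polynomials with bounded coefficients evaluated on complex disks enter the analysis.

```python
import math
import random


def sample_trace(x, delta, rng):
    """Deletion channel: drop each bit of x independently with probability delta."""
    return [b for b in x if rng.random() >= delta]


def empirical_means(x, delta, num_traces, rng):
    """Average the traces position by position.

    Assumption: a trace shorter than n is padded with zeros, so every
    position 0..n-1 has a well-defined value in every trace.
    """
    n = len(x)
    totals = [0] * n
    for _ in range(num_traces):
        for j, b in enumerate(sample_trace(x, delta, rng)):
            totals[j] += b
    return [t / num_traces for t in totals]


def expected_means(x, delta):
    """Exact mean of each (zero-padded) trace position, 0-indexed."""
    n = len(x)
    rho = 1.0 - delta
    return [
        rho * sum(
            math.comb(i, j) * rho**j * delta ** (i - j) * x[i]
            for i in range(j, n)
        )
        for j in range(n)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    x = [1, 0, 1, 1, 0, 0, 1, 0]  # hypothetical source string
    print(empirical_means(x, 0.3, 20000, rng))
    print(expected_means(x, 0.3))
```

A mean-based algorithm only ever receives the output of something like `empirical_means`. The number of traces it needs is governed by how far apart the vectors returned by `expected_means` are for two different source strings, since sampling noise must be smaller than that gap.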