Diptarka Chakraborty, Debarati Das, Robert Krauthgamer
Approximate Trace Reconstruction via Median String (in Average-Case)
Foundations of Software Technology and Theoretical Computer Science. DOI: 10.4230/LIPIcs.FSTTCS.2021.11 (https://doi.org/10.4230/LIPIcs.FSTTCS.2021.11). Publication date: 2021-07-20.
Citation count: 4
Abstract
We consider an \emph{approximate} version of the trace reconstruction problem, where the goal is to recover an unknown string $s\in\{0,1\}^n$ from $m$ traces, each generated independently by passing $s$ through a probabilistic insertion-deletion channel with rate $p$. We present a deterministic algorithm for the average-case model, where $s$ is random, that uses only \emph{three} traces. It runs in near-linear time $\tilde O(n)$ and with high probability reports a string within edit distance $O(\epsilon p n)$ from $s$ for $\epsilon=\tilde O(p)$, which significantly improves over the straightforward bound of $O(pn)$. Technically, our algorithm computes a $(1+\epsilon)$-approximate median of the three input traces. To prove its correctness, our probabilistic analysis shows that an approximate median is indeed close to the unknown $s$. To achieve the near-linear time bound, we have to bypass the well-known dynamic programming algorithm that computes an optimal median in time $O(n^3)$.
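
To make the objects in the abstract concrete, the following is a minimal Python sketch of the setting, not the paper's algorithm. It assumes a simplified channel (before each bit, one uniformly random bit is inserted with probability $p$; each bit is then deleted independently with probability $p$) and assumes Levenshtein distance as the edit distance; the paper's exact channel and distance definitions may differ. The names `trace`, `edit_distance`, and `median_objective` are hypothetical helpers introduced only for this illustration. The median objective shown is the sum of edit distances from a candidate string to the traces, which is the quantity a median string minimizes.

```python
import random


def trace(s, p, rng):
    """Pass s through a simplified insertion-deletion channel (assumed model)."""
    out = []
    for ch in s:
        # Assumption: with probability p, insert one uniformly random bit.
        if rng.random() < p:
            out.append(rng.choice("01"))
        # Assumption: with probability p, delete the original bit.
        if rng.random() >= p:
            out.append(ch)
    return "".join(out)


def edit_distance(a, b):
    """Levenshtein distance via the standard quadratic-time dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # match or substitute
        prev = cur
    return prev[-1]


def median_objective(candidate, traces):
    """Sum of edit distances from candidate to all traces."""
    return sum(edit_distance(candidate, t) for t in traces)


if __name__ == "__main__":
    rng = random.Random(0)
    n, p = 200, 0.05
    s = "".join(rng.choice("01") for _ in range(n))   # average case: random s
    traces = [trace(s, p, rng) for _ in range(3)]     # only three traces
    # Each trace alone is a baseline estimate of s with error on the order of p*n.
    print([edit_distance(s, t) for t in traces])
    print(median_objective(s, traces))
```

Under these assumptions, running the script prints the distance of each trace from $s$ and the objective value of $s$ itself, which gives a feel for the $O(pn)$ baseline that an approximate median is meant to improve on. The quadratic dynamic program here is only for illustration and is not near-linear.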