Authors: Ce Jin, Jakob Nogler
Journal: ACM Transactions on Algorithms
DOI: 10.1145/3672395 (https://doi.org/10.1145/3672395)
Publication date: 2024-06-10
Publication type: Journal Article
Abstract
Quantum Speed-ups for String Synchronizing Sets, Longest Common Substring, and \(k\)-mismatch Matching
Longest Common Substring (LCS) is an important text processing problem, which has recently been investigated in the quantum query model. The decision version of this problem, LCS with threshold \(d\), asks whether two length-\(n\) input strings have a common substring of length \(d\). The two extreme cases, \(d=1\) and \(d=n\), correspond respectively to Element Distinctness and Unstructured Search, two fundamental problems in quantum query complexity. However, the intermediate case \(1\ll d\ll n\) was not fully understood.
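To make the decision problem concrete, here is a minimal classical reference sketch in Python. It is not the quantum algorithm from the paper; the function name `has_common_substring` is hypothetical and the brute-force approach is only meant to pin down what "LCS with threshold \(d\)" asks.

```python
def has_common_substring(s: str, t: str, d: int) -> bool:
    """Classical brute-force check: do s and t share a substring of length d?

    Illustrative only; this is not the quantum algorithm discussed in the abstract.
    """
    if d < 1:
        raise ValueError("threshold d must be at least 1")
    if d > min(len(s), len(t)):
        return False
    # Collect every length-d substring of s, then probe with those of t.
    substrings_of_s = {s[i:i + d] for i in range(len(s) - d + 1)}
    return any(t[j:j + d] in substrings_of_s for j in range(len(t) - d + 1))


if __name__ == "__main__":
    print(has_common_substring("abcde", "xxcdx", 2))
    print(has_common_substring("abcde", "xxcdx", 3))
```

With \(d=1\) the check reduces to asking whether the two strings share a character, and with \(d=n\) it reduces to asking whether two length-\(n\) strings are equal, which is how the two extreme cases arise.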
We show that the complexity of LCS with threshold \(d\) smoothly interpolates between the two extreme cases up to \(n^{o(1)}\) factors:
LCS with threshold \(d\) has a quantum algorithm with \(n^{2/3+o(1)}/d^{1/6}\) query complexity and time complexity, and requires at least \(\Omega(n^{2/3}/d^{1/6})\) quantum query complexity.
Our result improves upon previous upper bounds \(\tilde{O}(\min\{n/d^{1/2},n^{2/3}\})\) (Le Gall and Seddighin ITCS 2022, Akmal and Jin SODA 2022), and answers an open question of Akmal and Jin.
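As a quick sanity check of these bounds, the following worked calculation evaluates \(n^{2/3}/d^{1/6}\) at the two extremes and compares it with the previous upper bound \(\min\{n/d^{1/2},n^{2/3}\}\). It ignores \(n^{o(1)}\) and polylogarithmic factors and is a reader's derivation, not a statement taken from the paper.

```latex
\[
\begin{aligned}
d = 1 &: \quad \frac{n^{2/3}}{d^{1/6}} = n^{2/3}, \\
d = n &: \quad \frac{n^{2/3}}{n^{1/6}} = n^{1/2}, \\
d \le n^{2/3} &: \quad \frac{\min\{n/d^{1/2},\, n^{2/3}\}}{n^{2/3}/d^{1/6}}
    = \frac{n^{2/3}}{n^{2/3}/d^{1/6}} = d^{1/6}, \\
d \ge n^{2/3} &: \quad \frac{\min\{n/d^{1/2},\, n^{2/3}\}}{n^{2/3}/d^{1/6}}
    = \frac{n/d^{1/2}}{n^{2/3}/d^{1/6}} = \left(\frac{n}{d}\right)^{1/3}.
\end{aligned}
\]
```

The endpoint values \(n^{2/3}\) and \(n^{1/2}\) agree with the known quantum query complexities of Element Distinctness and Unstructured Search, and the improvement over the previous bound is largest at \(d = n^{2/3}\), where the new bound is about \(n^{5/9}\) against \(n^{2/3}\).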
Our main technical contribution is a quantum speed-up of the powerful String Synchronizing Set technique introduced by Kempa and Kociumaka (STOC 2019). It consistently samples \(n/\tau^{1-o(1)}\) synchronizing positions in the string depending on their length-\(\Theta(\tau)\) contexts, and each synchronizing position can be reported by a quantum algorithm in \(\tilde{O}(\tau^{1/2+o(1)})\) time. Our quantum string synchronizing set also yields a near-optimal LCE data structure in the quantum setting.
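The idea of choosing positions consistently from their local contexts can be illustrated with a simplified classical sketch. This is not the quantum construction of the paper and not the full Kempa and Kociumaka definition: it uses a well-known local-minimum rule over hashed length-\(\tau\) substrings, it does not treat highly periodic regions specially, and the names `substring_id` and `synchronizing_positions` as well as the salt value are hypothetical.

```python
import hashlib
from typing import List


def substring_id(text: str, start: int, tau: int, salt: bytes = b"demo") -> int:
    """Hypothetical identifier of text[start:start+tau]; equal substrings get equal ids."""
    data = text[start:start + tau].encode("utf-8")
    digest = hashlib.blake2b(data, key=salt, digest_size=8).digest()
    return int.from_bytes(digest, "big")


def synchronizing_positions(text: str, tau: int) -> List[int]:
    """Select 0-based positions i whose choice depends only on text[i:i+2*tau].

    Position i is selected when the smallest id among the length-tau substrings
    starting in [i, i+tau] is attained at i or at i+tau (simplified rule).
    """
    n = len(text)
    last = n - 2 * tau  # last position whose length-2*tau context fits in text
    if tau <= 0 or last < 0:
        return []
    ids = [substring_id(text, i, tau) for i in range(n - tau + 1)]
    selected = []
    for i in range(last + 1):
        window_min = min(ids[i:i + tau + 1])
        if window_min in (ids[i], ids[i + tau]):
            selected.append(i)
    return selected


if __name__ == "__main__":
    print(synchronizing_positions("abracadabra" * 3 + "xyzzy", 3))
```

Under this rule, two positions with identical length-\(2\tau\) contexts are either both selected or both skipped, and every run of \(\tau\) consecutive positions (far enough from the end of the string) contains at least one selected position. On periodic inputs this simplified rule can select many more positions than the \(n/\tau^{1-o(1)}\) stated in the abstract, which is one reason the actual constructions are more involved.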
As another application of our quantum string synchronizing set, we study the \(k\)-mismatch Matching problem, which asks if the pattern has an occurrence in the text with at most \(k\) Hamming mismatches. Using a structural result of Charalampopoulos, Kociumaka, and Wellnitz (FOCS 2020), we obtain:
\(k\)-mismatch matching has a quantum algorithm with \(k^{3/4}n^{1/2+o(1)}\) query complexity and \(\tilde{O}(kn^{1/2})\) time complexity. We also observe a non-matching quantum query lower bound of \(\Omega(\sqrt{kn})\).
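For reference, the \(k\)-mismatch Matching question itself can be stated as a short classical brute-force check. The sketch below is only a specification of the problem in Python, with the hypothetical name `has_k_mismatch_occurrence`; it does not reflect the quantum algorithm or the structural result used in the paper.

```python
def has_k_mismatch_occurrence(pattern: str, text: str, k: int) -> bool:
    """Classical brute force: does pattern occur in text with at most k Hamming mismatches?"""
    m, n = len(pattern), len(text)
    if k < 0 or m > n:
        return False
    for start in range(n - m + 1):
        mismatches = 0
        for offset in range(m):
            if pattern[offset] != text[start + offset]:
                mismatches += 1
                if mismatches > k:
                    break  # this alignment already exceeds the budget
        else:
            # Inner loop finished without exceeding k mismatches.
            return True
    return False


if __name__ == "__main__":
    print(has_k_mismatch_occurrence("abcd", "xxabxdxx", 1))
    print(has_k_mismatch_occurrence("abcd", "xxabxdxx", 0))
```

This check takes time proportional to the product of the pattern and text lengths in the worst case, which is the baseline that the bounds above should be read against.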
About the journal:
ACM Transactions on Algorithms welcomes submissions of original research of the highest quality dealing with algorithms that are inherently discrete and finite, and having mathematical content in a natural way, either in the objective or in the analysis. Most welcome are new algorithms and data structures, new and improved analyses, and complexity results. Specific areas of computation covered by the journal include
combinatorial searches and objects;
counting;
discrete optimization and approximation;
randomization and quantum computation;
parallel and distributed computation;
algorithms for graphs, geometry, arithmetic, number theory, strings;
on-line analysis;
cryptography;
coding;
data compression;
learning algorithms;
methods of algorithmic analysis;
discrete algorithms for application areas such as biology, economics, game theory, communication, computer systems and architecture, hardware design, scientific computing.