The Streaming k-Mismatch Problem: Tradeoffs between Space and Total Time

Shay Golan, T. Kociumaka, T. Kopelowitz, E. Porat
Venue: Annual Symposium on Combinatorial Pattern Matching (CPM 2020)
Published: 2020-04-27
DOI: 10.4230/LIPIcs.CPM.2020.15 (https://doi.org/10.4230/LIPIcs.CPM.2020.15)
Citations: 5

Abstract

We revisit the $k$-mismatch problem in the streaming model on a pattern of length $m$ and a streaming text of length $n$, both over a size-$\sigma$ alphabet. The current state-of-the-art algorithm for the streaming $k$-mismatch problem, by Clifford et al. [SODA 2019], uses $\tilde O(k)$ space and $\tilde O\big(\sqrt k\big)$ worst-case time per character. The space complexity is known to be (unconditionally) optimal, and the worst-case time per character matches a conditional lower bound. However, there is a gap between the total time cost of the algorithm, which is $\tilde O(n\sqrt k)$, and the fastest known offline algorithm, which costs $\tilde O\big(n + \min\big(\frac{nk}{\sqrt m},\sigma n\big)\big)$ time. Moreover, it is not known whether improvements over the $\tilde O(n\sqrt k)$ total time are possible when using more than $O(k)$ space.

We address these gaps by designing a randomized streaming algorithm for the $k$-mismatch problem that, given an integer parameter $k\le s \le m$, uses $\tilde O(s)$ space and costs $\tilde O\big(n+\min\big(\frac {nk^2}m,\frac{nk}{\sqrt s},\frac{\sigma nm}s\big)\big)$ total time. For $s=m$, the total runtime becomes $\tilde O\big(n + \min\big(\frac{nk}{\sqrt m},\sigma n\big)\big)$, which matches the time cost of the fastest offline algorithm. Moreover, the worst-case time cost per character is still $\tilde O\big(\sqrt k\big)$.
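The abstract does not restate the problem itself, so the following is a minimal reference sketch of what a streaming $k$-mismatch algorithm is expected to output: after each text character arrives, the Hamming distance between the pattern and the last $m$ characters if that distance is at most $k$, and a "too many mismatches" signal otherwise. This is a naive baseline, not the algorithm of the paper or of Clifford et al.: it stores the last $m$ characters and spends up to $O(m)$ time per character. The function name `naive_streaming_k_mismatch` and the use of `None` as the signal are assumptions made for illustration.

```python
from collections import deque
from typing import Iterable, Iterator, Optional


def naive_streaming_k_mismatch(
    pattern: str, text: Iterable[str], k: int
) -> Iterator[Optional[int]]:
    """Yield one answer per text character (hypothetical reference baseline).

    The answer is the Hamming distance between the pattern and the last
    m characters read, if that distance is at most k. It is None while
    fewer than m characters have arrived or when the distance exceeds k.
    """
    m = len(pattern)
    window: deque = deque(maxlen=m)  # keeps only the last m characters
    for ch in text:
        window.append(ch)
        if len(window) < m:
            yield None
            continue
        mismatches = 0
        for p, t in zip(pattern, window):
            if p != t:
                mismatches += 1
                if mismatches > k:  # stop once the budget k is exceeded
                    break
        yield mismatches if mismatches <= k else None


answers = list(naive_streaming_k_mismatch("abc", "abcabdxbc", k=1))
print(answers)  # [None, None, 0, None, None, 1, None, None, 1]
```

The baseline is useful mainly as a correctness oracle: any faster or smaller-space implementation should produce the same sequence of answers on the same input, up to the error probability of a randomized algorithm.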