Faster Longest Common Extension Queries in Strings over General Alphabets

Paweł Gawrychowski, T. Kociumaka, W. Rytter, Tomasz Waleń
Published in: Annual Symposium on Combinatorial Pattern Matching, 2016-02-01. DOI: 10.4230/LIPIcs.CPM.2016.5 (https://doi.org/10.4230/LIPIcs.CPM.2016.5)
Citation count: 31

Abstract

Longest common extension (LCE) queries, often called longest common prefix queries, are a fundamental building block in many string algorithms, for example computing runs and approximate pattern matching. We show that a sequence of $q$ LCE queries for a string of size $n$ over a general ordered alphabet can be answered in $O(q \log \log n+n\log^*n)$ time making only $O(q+n)$ symbol comparisons. Consequently, all runs in a string over a general ordered alphabet can be computed in $O(n \log \log n)$ time making $O(n)$ symbol comparisons. Our results improve upon a solution by Kosolobov (Information Processing Letters, 2016), who gave an algorithm with $O(n \log^{2/3} n)$ running time and conjectured that $O(n)$ time is possible. We make significant progress towards resolving this conjecture. Our techniques extend to the case of general unordered alphabets, where the time increases to $O(q\log n + n\log^*n)$. The main tools are difference covers and the disjoint-sets data structure.
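For readers unfamiliar with the query itself, the following is a minimal sketch of what a single LCE query computes, written as a naive baseline. It is not the algorithm of the paper: it uses neither difference covers nor a disjoint-sets structure, and a single call may take time linear in $n$. The function name `naive_lce` and the 0-based indexing are assumptions made for illustration. The sketch only tests symbols for equality, which matches the general unordered alphabet setting mentioned in the abstract.

```python
from typing import Sequence


def naive_lce(s: Sequence, i: int, j: int) -> int:
    """Return the length of the longest common prefix of s[i:] and s[j:].

    Illustrative baseline only (hypothetical helper, 0-based positions).
    Symbols are accessed solely through equality tests.
    """
    n = len(s)
    k = 0
    # Extend the match one symbol at a time until a mismatch
    # or until either suffix runs out of symbols.
    while i + k < n and j + k < n and s[i + k] == s[j + k]:
        k += 1
    return k


if __name__ == "__main__":
    text = "abababb"
    # Suffixes "abababb" and "ababb" agree on "abab" and then differ.
    print(naive_lce(text, 0, 2))
```

Answering $q$ queries this way costs $O(qn)$ symbol comparisons in the worst case, which is the cost the bounds stated in the abstract are designed to avoid.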