On the Benefit of Merging Suffix Array Intervals for Parallel Pattern Matching

J. Fischer, D. Köppl, Florian Kurpicz
Published in: Annual Symposium on Combinatorial Pattern Matching, 2016-06-01
DOI: 10.4230/LIPIcs.CPM.2016.26 (https://doi.org/10.4230/LIPIcs.CPM.2016.26)
Citations: 14

Abstract

We present parallel algorithms for exact and approximate pattern matching with suffix arrays, using a CREW-PRAM with $p$ processors. Given a static text of length $n$, we first show how to compute the suffix array interval of a given pattern of length $m$ in $O(\frac{m}{p}+ \lg p + \lg\lg p\cdot\lg\lg n)$ time for $p \le m$. For approximate pattern matching with $k$ differences or mismatches, we show how to compute all occurrences of a given pattern in $O(\frac{m^k\sigma^k}{p}\max\left(k,\lg\lg n\right)\!+\!(1+\frac{m}{p}) \lg p\cdot \lg\lg n + \text{occ})$ time, where $\sigma$ is the size of the alphabet and $p \le \sigma^k m^k$. The workhorse of our algorithms is a data structure for merging suffix array intervals quickly: Given the suffix array intervals for two patterns $P$ and $P'$, we present a data structure for computing the interval of $PP'$ in $O(\lg\lg n)$ sequential time, or in $O(1+\lg_p\lg n)$ parallel time. All our data structures are of size $O(n)$ bits (in addition to the suffix array).
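
To make the interval-merging step concrete, here is a minimal sketch under stated assumptions. It is not the data structure from the paper: it uses a plain binary search over the inverse suffix array, which takes $O(\lg n)$ sequential time rather than the $O(\lg\lg n)$ bound stated above, and it builds the suffix array naively. The function names (`build_suffix_array`, `sa_interval`, `merge_intervals`) are hypothetical and chosen for illustration; both patterns are assumed to be non-empty. The sketch relies on one observation: inside the interval of $P$, the suffixes are sorted by what follows $P$, so the ranks of the suffixes starting $|P|$ positions later increase from left to right.

```python
from bisect import bisect_left, bisect_right


def build_suffix_array(text):
    # Naive construction by sorting suffixes; for illustration only.
    return sorted(range(len(text)), key=lambda i: text[i:])


def sa_interval(text, sa, pattern):
    """Inclusive interval (lo, hi) of suffixes starting with pattern, or None."""
    m = len(pattern)
    # Truncating sorted suffixes to length m keeps them in non-decreasing order.
    prefixes = [text[i:i + m] for i in sa]
    lo = bisect_left(prefixes, pattern)
    hi = bisect_right(prefixes, pattern) - 1
    return (lo, hi) if lo <= hi else None


def merge_intervals(sa, isa, iv_p, len_p, iv_q):
    """Interval of PP' from the intervals of P (iv_p) and P' (iv_q).

    Assumption: plain binary search, O(lg n) time; not the paper's structure.
    """
    if iv_p is None or iv_q is None:
        return None
    n = len(sa)
    lo_p, hi_p = iv_p
    lo_q, hi_q = iv_q

    def rank_after(i):
        # Rank of the suffix that starts right after the occurrence of P.
        # The empty suffix (past the end of the text) ranks below all others.
        j = sa[i] + len_p
        return isa[j] if j < n else -1

    # First position in P's interval whose continuation has rank >= lo_q.
    a, b = lo_p, hi_p + 1
    while a < b:
        mid = (a + b) // 2
        if rank_after(mid) < lo_q:
            a = mid + 1
        else:
            b = mid
    lo = a

    # First position at or after lo whose continuation has rank > hi_q.
    a, b = lo, hi_p + 1
    while a < b:
        mid = (a + b) // 2
        if rank_after(mid) <= hi_q:
            a = mid + 1
        else:
            b = mid
    hi = a - 1
    return (lo, hi) if lo <= hi else None


def inverse_suffix_array(sa):
    # isa[j] is the rank of the suffix starting at text position j.
    isa = [0] * len(sa)
    for rank, pos in enumerate(sa):
        isa[pos] = rank
    return isa


text = "banana"
sa = build_suffix_array(text)
isa = inverse_suffix_array(sa)
iv_a = sa_interval(text, sa, "a")
iv_na = sa_interval(text, sa, "na")
merged = merge_intervals(sa, isa, iv_a, len("a"), iv_na)
print(merged == sa_interval(text, sa, "ana"))
```

With such a merge operation, a pattern can be cut into pieces whose intervals are computed independently and then combined pairwise, which is the role the abstract assigns to merging in the parallel algorithms.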