Annual Symposium on Combinatorial Pattern Matching: Latest Papers

Tight Bounds on the Maximum Number of Shortest Unique Substrings
Annual Symposium on Combinatorial Pattern Matching Pub Date : 2016-09-23 DOI: 10.4230/LIPIcs.CPM.2017.24
Takuya Mieno, Shunsuke Inenaga, H. Bannai, M. Takeda
Abstract: A substring Q of a string S is called a shortest unique substring (SUS) for interval [s,t] in S if Q occurs exactly once in S, this occurrence of Q contains interval [s,t], and every substring of S which contains interval [s,t] and is shorter than Q occurs at least twice in S. The SUS problem is, given a string S, to preprocess S so that for any subsequent query interval [s,t] all the SUSs for interval [s,t] can be answered quickly. When s = t, we call the SUSs for [s,t] point SUSs, and when s <= t, we call the SUSs for [s,t] interval SUSs. There exists an optimal O(n)-time preprocessing scheme which answers queries in optimal O(k) time for both point and interval SUSs, where n is the length of S and k is the number of outputs for a given query. In this paper, we reveal structural, combinatorial properties underlying the SUS problem: namely, we show that the number of intervals in S that correspond to point SUSs for all query positions in S is less than 1.5n, and show that this is a matching upper and lower bound. Also, we consider the maximum number of intervals in S that correspond to interval SUSs for all query intervals in S.
Citations: 3
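The SUS definition in the abstract above can be checked directly by brute force. The following is only a minimal sketch of the definition, not the O(n)-time preprocessing scheme the abstract refers to; the function names `count_occurrences` and `sus` are hypothetical, and positions are 0-based with inclusive end points.

```python
from typing import List, Tuple


def count_occurrences(text: str, q: str) -> int:
    """Count occurrences of q in text, including overlapping ones."""
    return sum(1 for i in range(len(text) - len(q) + 1) if text.startswith(q, i))


def sus(text: str, s: int, t: int) -> List[Tuple[int, int]]:
    """Return all shortest unique substrings covering [s, t] as (start, end) pairs.

    Brute force: try lengths in increasing order and stop at the first length
    for which some unique substring contains the whole interval [s, t].
    """
    n = len(text)
    for length in range(t - s + 1, n + 1):
        found = []
        # A window [i, i + length - 1] contains [s, t] iff i <= s and i + length - 1 >= t.
        for i in range(max(0, t - length + 1), min(s, n - length) + 1):
            if count_occurrences(text, text[i:i + length]) == 1:
                found.append((i, i + length - 1))
        if found:
            return found
    return []
```

Calling `sus(text, p, p)` gives the point SUSs for position `p`; a query with `s < t` gives interval SUSs.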
Graph Motif Problems Parameterized by Dual
Annual Symposium on Combinatorial Pattern Matching Pub Date : 2016-06-27 DOI: 10.4230/LIPIcs.CPM.2016.7
G. Fertin, Christian Komusiewicz
Abstract: Let G=(V,E) be a vertex-colored graph, where C is the set of colors used to color V. The Graph Motif (or GM) problem takes as input G and a multiset M of colors built from C, and asks whether there is a subset $S \subseteq V$ such that (i) G[S] is connected and (ii) the multiset of colors obtained from S equals M. The Colorful Graph Motif problem (or CGM) is a constrained version of GM in which M=C, and the List-Colored Graph Motif problem (or LGM) is the extension of GM in which each vertex v of V may choose its color from a list L(v) of colors. We study the three problems GM, CGM and LGM, parameterized by $l := |V| - |M|$. In particular, for general graphs, we show that, assuming the strong exponential-time hypothesis, CGM has no $(2-\epsilon)^l \cdot |V|^{O(1)}$-time algorithm, which implies that a previous algorithm, running in $O(2^l \cdot |E|)$ time, is optimal. We also prove that LGM is W[1]-hard even if we restrict ourselves to lists of at most two colors. If we constrain the input graph to be a tree, then we show that, in contrast to CGM, GM can be solved in $O(4^l \cdot |V|)$ time but admits no polynomial kernel, while CGM can be solved in $O(\sqrt{2}^l + |V|)$ time and admits a polynomial kernel.
Citations: 4
Factorizing a String into Squares in Linear Time
Annual Symposium on Combinatorial Pattern Matching Pub Date : 2016-06-01 DOI: 10.4230/LIPIcs.CPM.2016.27
Yoshiaki Matsuoka, Shunsuke Inenaga, H. Bannai, M. Takeda, F. Manea
Abstract: A square factorization of a string w is a factorization of w in which each factor is a square. Dumitran et al. [SPIRE 2015, pp. 54-66] showed how to find a square factorization of a given string of length n in $O(n \log n)$ time, and they posed the question of whether it can be done in $O(n)$ time. In this paper, we answer their question positively, showing an $O(n)$-time algorithm for square factorization in the standard word RAM model with machine word size $\omega = \Omega(\log n)$. We also show an $O(n + (n \log^2 n)/\omega)$-time (respectively, $O(n \log n)$-time) algorithm to find a square factorization which contains the maximum (respectively, minimum) number of squares.
Citations: 5
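To make the notion of a square factorization from the abstract above concrete, here is a simple dynamic-programming sketch. It is deliberately naive (polynomial time, with no attempt at the O(n) bound of the paper), and the names `is_square` and `square_factorization` are hypothetical.

```python
from typing import List, Optional


def is_square(w: str) -> bool:
    """A square is a non-empty string of the form xx."""
    half, rem = divmod(len(w), 2)
    return rem == 0 and half > 0 and w[:half] == w[half:]


def square_factorization(w: str) -> Optional[List[str]]:
    """Return some factorization of w into squares, or None if none exists."""
    n = len(w)
    reachable = [False] * (n + 1)  # reachable[j]: w[:j] has a square factorization
    prev = [-1] * (n + 1)          # prev[j]: start of the last square ending at j
    reachable[0] = True
    # Squares have even length, so only even prefix lengths can be reachable.
    for j in range(2, n + 1, 2):
        for i in range(j - 2, -1, -2):
            if reachable[i] and is_square(w[i:j]):
                reachable[j] = True
                prev[j] = i
                break
    if not reachable[n]:
        return None
    factors = []
    j = n
    while j > 0:
        factors.append(w[prev[j]:j])
        j = prev[j]
    return factors[::-1]
```

The sketch returns an arbitrary valid factorization; finding one with the maximum or minimum number of squares, as the abstract discusses, would need a different recurrence.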
On the Benefit of Merging Suffix Array Intervals for Parallel Pattern Matching
Annual Symposium on Combinatorial Pattern Matching Pub Date : 2016-06-01 DOI: 10.4230/LIPIcs.CPM.2016.26
J. Fischer, D. Köppl, Florian Kurpicz
Abstract: We present parallel algorithms for exact and approximate pattern matching with suffix arrays, using a CREW-PRAM with $p$ processors. Given a static text of length $n$, we first show how to compute the suffix array interval of a given pattern of length $m$ in $O(\frac{m}{p} + \lg p + \lg\lg p \cdot \lg\lg n)$ time for $p \le m$. For approximate pattern matching with $k$ differences or mismatches, we show how to compute all occurrences of a given pattern in $O(\frac{m^k\sigma^k}{p}\max\left(k,\lg\lg n\right) + (1+\frac{m}{p}) \lg p \cdot \lg\lg n + \text{occ})$ time, where $\sigma$ is the size of the alphabet and $p \le \sigma^k m^k$. The workhorse of our algorithms is a data structure for merging suffix array intervals quickly: given the suffix array intervals for two patterns $P$ and $P'$, we present a data structure for computing the interval of $PP'$ in $O(\lg\lg n)$ sequential time, or in $O(1+\lg_p \lg n)$ parallel time. All our data structures are of size $O(n)$ bits (in addition to the suffix array).
Citations: 14
Efficient Index for Weighted Sequences
Annual Symposium on Combinatorial Pattern Matching Pub Date : 2016-02-02 DOI: 10.4230/LIPIcs.CPM.2016.4
Carl Barton, T. Kociumaka, S. Pissis, J. Radoszewski
Abstract: The problem of finding factors of a text string which are identical or similar to a given pattern string is a central problem in computer science. A generalised version of this problem consists in implementing an index over the text to support efficient on-line pattern queries. We study this problem in the case where the text is weighted: for every position of the text and every letter of the alphabet a probability of occurrence of this letter at this position is given. Sequences of this type, also called position weight matrices, are commonly used to represent imprecise or uncertain data. A weighted sequence may represent many different strings, each with probability of occurrence equal to the product of probabilities of its letters at subsequent positions. Given a probability threshold $1/z$, we say that a pattern string $P$ matches a weighted text at position $i$ if the product of probabilities of the letters of $P$ at positions $i,\ldots,i+|P|-1$ in the text is at least $1/z$. In this article, we present an $O(nz)$-time construction of an $O(nz)$-sized index that can answer pattern matching queries in a weighted text in optimal time, improving upon the state of the art by a factor of $z \log z$. Other applications of this data structure include an $O(nz)$-time construction of the weighted prefix table and an $O(nz)$-time computation of all covers of a weighted sequence, which improve upon the state of the art by the same factor.
Citations: 21
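The matching condition for weighted sequences in the abstract above (product of letter probabilities at least $1/z$) can be illustrated without any index. This is a minimal sketch of the definition only; representing the weighted text as a list of letter-to-probability dictionaries and the names `matches_at` and `weighted_occurrences` are assumptions made for illustration, and the scan is much slower than the $O(nz)$-sized index the paper describes.

```python
from typing import Dict, List


def matches_at(weighted: List[Dict[str, float]], pattern: str, i: int, z: float) -> bool:
    """Check whether pattern matches the weighted text at position i with threshold 1/z."""
    if i < 0 or i + len(pattern) > len(weighted):
        return False
    prob = 1.0
    for offset, letter in enumerate(pattern):
        # Letters not listed at a position are treated as having probability 0.
        prob *= weighted[i + offset].get(letter, 0.0)
    return prob >= 1.0 / z


def weighted_occurrences(weighted: List[Dict[str, float]], pattern: str, z: float) -> List[int]:
    """Return every position where pattern matches, by scanning all positions."""
    last_start = len(weighted) - len(pattern)
    return [i for i in range(last_start + 1) if matches_at(weighted, pattern, i, z)]
```

For example, with `text = [{"a": 0.5, "b": 0.5}, {"a": 1.0}, {"a": 0.25, "b": 0.75}]`, the pattern `"aa"` has probability 0.5 at position 0 and 0.25 at position 1, so both positions match for `z = 4` but only position 0 matches for `z = 2`.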
Faster Longest Common Extension Queries in Strings over General Alphabets
Annual Symposium on Combinatorial Pattern Matching Pub Date : 2016-02-01 DOI: 10.4230/LIPIcs.CPM.2016.5
Paweł Gawrychowski, T. Kociumaka, W. Rytter, Tomasz Waleń
Abstract: Longest common extension queries (often called longest common prefix queries) constitute a fundamental building block in multiple string algorithms, for example computing runs and approximate pattern matching. We show that a sequence of $q$ LCE queries for a string of size $n$ over a general ordered alphabet can be realized in $O(q \log\log n + n\log^* n)$ time making only $O(q+n)$ symbol comparisons. Consequently, all runs in a string over a general ordered alphabet can be computed in $O(n \log\log n)$ time making $O(n)$ symbol comparisons. Our results improve upon a solution by Kosolobov (Information Processing Letters, 2016), who gave an algorithm with $O(n \log^{2/3} n)$ running time and conjectured that $O(n)$ time is possible. We make significant progress towards resolving this conjecture. Our techniques extend to the case of general unordered alphabets, when the time increases to $O(q\log n + n\log^* n)$. The main tools are difference covers and the disjoint-sets data structure.
Citations: 31
Deterministic sub-linear space LCE data structures with efficient construction
Annual Symposium on Combinatorial Pattern Matching Pub Date : 2016-01-28 DOI: 10.4230/LIPIcs.CPM.2016.1
Yuka Tanimura, T. I., H. Bannai, Shunsuke Inenaga, S. Puglisi, M. Takeda
Abstract: Given a string $S$ of $n$ symbols, a longest common extension query $\mathsf{LCE}(i,j)$ asks for the length of the longest common prefix of the $i$th and $j$th suffixes of $S$. LCE queries have several important applications in string processing, perhaps most notably to suffix sorting. Recently, Bille et al. (J. Discrete Algorithms 25:42-50, 2014, Proc. CPM 2015: 65-76) described several data structures for answering LCE queries that offer a space-time trade-off between data structure size and query time. In particular, for a parameter $1 \leq \tau \leq n$, their best deterministic solution is a data structure of size $O(n/\tau)$ which allows LCE queries to be answered in $O(\tau)$ time. However, the construction time for all deterministic versions of their data structure is quadratic in $n$. In this paper, we propose a deterministic solution that achieves a similar space-time trade-off of $O(\tau\min\{\log\tau,\log\frac{n}{\tau}\})$ query time using $O(n/\tau)$ space, but significantly improves the construction time to $O(n\tau)$.
Citations: 18
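For reference, the LCE query defined in the abstract above can be answered with no preprocessing by comparing the two suffixes symbol by symbol. The sketch below is this naive baseline only, not one of the data structures discussed in the papers; the function name `lce` is hypothetical and indices are 0-based.

```python
def lce(s: str, i: int, j: int) -> int:
    """Length of the longest common prefix of the suffixes s[i:] and s[j:].

    Naive scan: no extra space, time proportional to the answer.
    """
    n = len(s)
    length = 0
    while i + length < n and j + length < n and s[i + length] == s[j + length]:
        length += 1
    return length
```

With `s = "abcabcab"`, the suffixes starting at 0 and 3 are `"abcabcab"` and `"abcab"`, so `lce(s, 0, 3)` is 5.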
Hardness of RNA Folding Problem With Four Symbols
Annual Symposium on Combinatorial Pattern Matching Pub Date : 2015-11-15 DOI: 10.4230/LIPIcs.CPM.2016.13
Yi-Jun Chang
Abstract: An RNA sequence is a string composed of four types of nucleotides, $A, C, G$, and $U$. The goal of the RNA folding problem is to find a maximum cardinality set of crossing-free pairs of the form $\{A,U\}$ or $\{C,G\}$ in a given RNA sequence. The problem is central in bioinformatics and has received much attention over the years. Abboud, Backurs, and Williams (FOCS 2015) demonstrated a conditional lower bound for a generalized version of the RNA folding problem based on a conjectured hardness of the $k$-clique problem. Their lower bound requires the RNA sequence to have at least 36 types of symbols, making the result not applicable to the RNA folding problem in real life (i.e., alphabet size 4). In this paper, we present an improved lower bound that works for the alphabet size 4 case. We also investigate the Dyck edit distance problem, which is a string problem closely related to RNA folding. We demonstrate a reduction from RNA folding to Dyck edit distance with alphabet size 10. This leads to a much simpler proof of the conditional lower bound for the Dyck edit distance problem given by Abboud, Backurs, and Williams (FOCS 2015), and lowers the alphabet size requirement.
Citations: 13
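The RNA folding objective stated in the abstract above (a maximum set of crossing-free $\{A,U\}$ or $\{C,G\}$ pairs) can be computed with the classic cubic-time interval dynamic program, in the style of Nussinov's algorithm. The sketch below is that textbook recurrence under the abstract's simplified model (no minimum loop length); it is not taken from the paper, whose subject is a lower bound, and the name `max_pairs` is hypothetical.

```python
def max_pairs(rna: str) -> int:
    """Maximum number of non-crossing A-U / C-G pairs in rna (cubic-time DP)."""
    allowed = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}
    n = len(rna)
    if n == 0:
        return 0
    # best[i][j]: optimum for the substring rna[i..j], inclusive.
    best = [[0] * n for _ in range(n)]
    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            value = best[i + 1][j]  # case: position i stays unpaired
            for k in range(i + 1, j + 1):
                if (rna[i], rna[k]) in allowed:
                    # case: i pairs with k, splitting the interval in two
                    inside = best[i + 1][k - 1] if k - 1 >= i + 1 else 0
                    outside = best[k + 1][j] if k + 1 <= j else 0
                    value = max(value, 1 + inside + outside)
            best[i][j] = value
    return best[0][n - 1]
```

For `"ACGU"` the pairs (A,U) and (C,G) nest, giving 2, whereas in `"ACUG"` the same two pairs would cross, so only one can be kept.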
Range Minimum Query Indexes in Higher Dimensions
Annual Symposium on Combinatorial Pattern Matching Pub Date : 2015-06-29 DOI: 10.1007/978-3-319-19929-0_13
P. Davoodi, J. Iacono, G. M. Landau, Moshe Lewenstein
Citations: 0
Parallel External Memory Suffix Sorting
Annual Symposium on Combinatorial Pattern Matching Pub Date : 2015-06-29 DOI: 10.1007/978-3-319-19929-0_28
Juha Kärkkäinen, Dominik Kempa, S. Puglisi
Citations: 44