Annual Symposium on Combinatorial Pattern Matching: Latest Publications

The Longest Filled Common Subsequence Problem
Annual Symposium on Combinatorial Pattern Matching Pub Date : 2017-07-01 DOI: 10.4230/LIPIcs.CPM.2017.14
M. Castelli, R. Dondi, G. Mauri, I. Zoppis
Abstract: Inspired by a recent approach for genome reconstruction from incomplete data, we consider a variant of the longest common subsequence problem for the comparison of two sequences, one of which is incomplete, i.e. it has some missing elements. The new combinatorial problem, called Longest Filled Common Subsequence, given two sequences A and B, and a multiset M of symbols missing in B, asks for a sequence B* obtained by inserting the symbols of M into B so that B* induces a common subsequence with A of maximum length. First, we investigate the computational and approximation complexity of the problem and we show that it is NP-hard and APX-hard when A contains at most two occurrences of each symbol. Then, we give a 3/5-approximation algorithm for the problem. Finally, we present a fixed-parameter algorithm, when the problem is parameterized by the number of symbols inserted in B that "match" symbols of A.
Citations: 7
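To make the problem statement in the entry above concrete, here is a minimal brute-force sketch in Python. It is not the approximation or fixed-parameter algorithm from the paper: it simply enumerates every way of inserting all symbols of M into B and takes the best classical LCS with A, so it is only usable on tiny inputs. The function names (`lcs_length`, `fillings`, `lfcs_length_bruteforce`) are hypothetical. Inserting all of M is assumed here; adding symbols can never shorten a common subsequence, so this does not change the optimum.

```python
from functools import lru_cache


def lcs_length(a, b):
    """Length of a longest common subsequence of a and b (textbook DP)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def fillings(b, m):
    """All distinct strings obtained by inserting every symbol of multiset m into b."""
    m = tuple(sorted(m))

    @lru_cache(maxsize=None)
    def rec(i, rest):
        if i == len(b) and not rest:
            return frozenset([""])
        out = set()
        if i < len(b):  # keep the next symbol of b
            out.update(b[i] + s for s in rec(i + 1, rest))
        for k in range(len(rest)):  # or insert one of the missing symbols
            if k and rest[k] == rest[k - 1]:
                continue  # skip duplicate choices of the same symbol
            out.update(rest[k] + s for s in rec(i, rest[:k] + rest[k + 1:]))
        return frozenset(out)

    return rec(0, m)


def lfcs_length_bruteforce(a, b, m):
    """Exponential-time reference value for tiny instances only."""
    return max(lcs_length(a, s) for s in fillings(b, m))
```

For example, with A = "abcd", B = "ad" and M = {b, c}, the filling "abcd" matches A completely, so the optimum is 4.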
28th Annual Symposium on Combinatorial Pattern Matching, CPM 2017, July 4-6, 2017, Warsaw, Poland
Annual Symposium on Combinatorial Pattern Matching Pub Date : 2017-07-01 DOI: 10.4230/LIPICS.CPM.2017.0
Juha Kärkkäinen, J. Radoszewski, W. Rytter
Citations: 1
Document Listing on Repetitive Collections with Guaranteed Performance
Annual Symposium on Combinatorial Pattern Matching Pub Date : 2017-06-28 DOI: 10.4230/LIPIcs.CPM.2017.4
G. Navarro
Abstract: We consider document listing on string collections, that is, finding in which strings a given pattern appears. In particular, we focus on repetitive collections: a collection of size N over alphabet [1,a] is composed of D copies of a string of size n, and s single-character edits are applied on the copies. We introduce the first document listing index with size O~(n + s), precisely O((n lg a + s lg^2 N) lg D) bits, and with useful worst-case time guarantees: given a pattern of length m, the index reports the ndoc strings where it appears in time O(m^2 + m lg N (lg D + lg^e N) ndoc), for any constant e > 0.
Citations: 21
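The document listing query defined in the entry above can be illustrated with a naive baseline. This sketch is not the compressed index of the paper: it scans every document, which costs time proportional to the whole collection, and it only fixes what the query is expected to return. The name `list_documents` is hypothetical.

```python
def list_documents(docs, pattern):
    """Return the indices of all documents that contain pattern (naive scan)."""
    return [d for d, text in enumerate(docs) if pattern in text]


# A small repetitive collection: copies of one string with single-character edits.
collection = ["banana", "bandana", "cabana"]
hits = list_documents(collection, "nd")  # only "bandana" contains "nd"
```

An index such as the one in the paper aims to answer the same query in time that depends on the pattern length and the number of reported documents rather than on the collection size.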
Representing the suffix tree with the CDAWG
Annual Symposium on Combinatorial Pattern Matching Pub Date : 2017-05-24 DOI: 10.4230/LIPIcs.CPM.2017.7
D. Belazzougui, F. Cunial
Abstract: Given a string $T$, it is known that its suffix tree can be represented using the compact directed acyclic word graph (CDAWG) with $e_T$ arcs, taking overall $O(e_T+e_{\overline{T}})$ words of space, where $\overline{T}$ is the reverse of $T$, and supporting some key operations in time between $O(1)$ and $O(\log\log n)$ in the worst case. This representation is especially appealing for highly repetitive strings, like collections of similar genomes or of version-controlled documents, in which $e_T$ grows sublinearly in the length of $T$ in practice. In this paper we augment such representation, supporting a number of additional queries in worst-case time between $O(1)$ and $O(\log n)$ in the RAM model, without increasing space complexity asymptotically. Our technique, based on a heavy path decomposition of the suffix tree, also enables a representation of the suffix array, of the inverse suffix array, and of $T$ itself, that takes $O(e_T)$ words of space, and that supports random access in $O(\log n)$ time.
Furthermore, we establish a connection between the reversed CDAWG of $T$ and a context-free grammar that produces $T$ and only $T$, which might have independent interest.
Citations: 29
Faster STR-IC-LCS computation via RLE
Annual Symposium on Combinatorial Pattern Matching Pub Date : 2017-03-15 DOI: 10.4230/LIPIcs.CPM.2017.20
Keita Kuboi, Yuta Fujishige, Shunsuke Inenaga, H. Bannai, M. Takeda
Abstract: The constrained LCS problem asks one to find a longest common subsequence of two input strings $A$ and $B$ with some constraints. The STR-IC-LCS problem is a variant of the constrained LCS problem, where the solution must include a given constraint string $C$ as a substring. Given two strings $A$ and $B$ of respective lengths $M$ and $N$, and a constraint string $C$ of length at most $\min\{M, N\}$, the best known algorithm for the STR-IC-LCS problem, proposed by Deorowicz (Inf. Process. Lett., 11:423-426, 2012), runs in $O(MN)$ time. In this work, we present an $O(mN + nM)$-time solution to the STR-IC-LCS problem, where $m$ and $n$ denote the sizes of the run-length encodings of $A$ and $B$, respectively. Since $m \leq M$ and $n \leq N$ always hold, our algorithm is always as fast as Deorowicz's algorithm, and is faster when input strings are compressible via RLE.
Citations: 6
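The parameters m and n in the entry above are the sizes of the run-length encodings of the inputs. As a minimal sketch of that encoding only (it does not implement the STR-IC-LCS algorithm), the following Python snippet computes the runs of a string; the function name `run_length_encode` is hypothetical.

```python
from itertools import groupby


def run_length_encode(s):
    """Return the maximal runs of s as (symbol, length) pairs."""
    return [(ch, sum(1 for _ in group)) for ch, group in groupby(s)]


a = "aaabccdddd"
runs = run_length_encode(a)
m, M = len(runs), len(a)  # RLE size m never exceeds the plain length M
```

For "aaabccdddd" the encoding has 4 runs against 10 characters, which is the kind of gap that makes an $O(mN + nM)$ bound smaller than $O(MN)$.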
Fast and Simple Jumbled Indexing for Binary Run-Length Encoded Strings
Annual Symposium on Combinatorial Pattern Matching Pub Date : 2017-02-04 DOI: 10.4230/LIPIcs.CPM.2017.19
L. Cunha, S. Dantas, T. Gagie, Roland Wittler, L. Kowada, J. Stoye
Abstract: Important papers have appeared recently on the problem of indexing binary strings for jumbled pattern matching, and further lowering the time bounds in terms of the input size would now be a breakthrough with broad implications. We can still make progress on the problem, however, by considering other natural parameters. Badkobeh et al. (IPL, 2013) and Amir et al. (TCS, 2016) gave algorithms that index a binary string in O(n + r^2 log r) time, where n is the length and r is the number of runs, and Giaquinta and Grabowski (IPL, 2013) gave one that runs in O(n + r^2) time. In this paper we propose a new and very simple algorithm that also runs in O(n + r^2) time and can be extended either so that the index returns the position of a match (if there is one), or so that the algorithm uses only O(n) bits of space instead of O(n) words.
Citations: 2
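For readers unfamiliar with jumbled indexing of binary strings, the sketch below shows what such an index stores and how it is queried. It relies on the standard observation that, for a fixed window length k, the numbers of 1s over all length-k substrings form a contiguous range, so the minimum and maximum per length suffice. The construction here is a naive quadratic one, not the O(n + r^2) algorithm of the paper, and the names `build_jumbled_index` and `has_jumbled_match` are hypothetical.

```python
def build_jumbled_index(s):
    """For each window length k, store the min and max number of 1s in s."""
    n = len(s)
    prefix = [0]
    for ch in s:
        prefix.append(prefix[-1] + (ch == "1"))
    lo = [0] * (n + 1)
    hi = [0] * (n + 1)
    for k in range(1, n + 1):
        counts = [prefix[i + k] - prefix[i] for i in range(n - k + 1)]
        lo[k], hi[k] = min(counts), max(counts)
    return lo, hi


def has_jumbled_match(index, zeros, ones):
    """True if some substring has exactly `zeros` 0s and `ones` 1s."""
    lo, hi = index
    k = zeros + ones
    return 0 <= k < len(lo) and lo[k] <= ones <= hi[k]


index = build_jumbled_index("0110")
```

With this index for "0110", a query for two 1s and no 0s succeeds (substring "11"), while a query for two 0s and no 1s fails.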
From LZ77 to the Run-Length Encoded Burrows-Wheeler Transform, and Back
Annual Symposium on Combinatorial Pattern Matching Pub Date : 2017-02-04 DOI: 10.4230/LIPIcs.CPM.2017.17
A. Policriti, N. Prezza
Abstract: The Lempel-Ziv factorization (LZ77) and the Run-Length encoded Burrows-Wheeler Transform (RLBWT) are two important tools in text compression and indexing, their sizes $z$ and $r$ being closely related to the amount of text self-repetitiveness. In this paper we consider the problem of converting the two representations into each other within a working space proportional to the input and the output. Let $n$ be the text length. We show that RLBWT can be converted to LZ77 in $\mathcal{O}(n\log r)$ time and $\mathcal{O}(r)$ words of working space. Conversely, we provide an algorithm to convert LZ77 to RLBWT in $\mathcal{O}\big(n(\log r + \log z)\big)$ time and $\mathcal{O}(r+z)$ words of working space. Note that $r$ and $z$ can be constant if the text is highly repetitive, and our algorithms can operate with (up to) exponentially less space than naive solutions based on full decompression.
Citations: 11
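The two size measures $z$ and $r$ from the entry above can be computed naively on small strings, which helps to see why both stay small on repetitive text. The sketch below works on the fully decompressed text, which is exactly what the paper avoids, so it is an illustration of the definitions and not of the conversion algorithms. The assumptions are labeled in the code: the BWT is taken over the text with a `$` sentinel appended, and the LZ77 variant used allows self-referential (overlapping) phrases, with a single new character as its own phrase. All function names are hypothetical.

```python
def bwt(s, sentinel="$"):
    """Burrows-Wheeler transform via sorted rotations (assumes sentinel not in s)."""
    t = s + sentinel
    rotations = sorted(t[i:] + t[:i] for i in range(len(t)))
    return "".join(rot[-1] for rot in rotations)


def count_runs(s):
    """Number of maximal runs of equal characters, i.e. r when applied to the BWT."""
    return sum(1 for i in range(len(s)) if i == 0 or s[i] != s[i - 1])


def lz77_phrases(s):
    """Greedy LZ77 parse: longest earlier-starting match, or one fresh character."""
    phrases, i = [], 0
    while i < len(s):
        best = 0
        for j in range(i):  # candidate source positions; overlap is allowed
            length = 0
            while i + length < len(s) and s[j + length] == s[i + length]:
                length += 1
            best = max(best, length)
        step = max(best, 1)
        phrases.append(s[i:i + step])
        i += step
    return phrases


text = "ab" * 8
r = count_runs(bwt(text))
z = len(lz77_phrases(text))
```

For the 16-character text "abab...ab" both measures equal 3 under these conventions, compared with a text length of 16.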
A family of approximation algorithms for the maximum duo-preservation string mapping problem
Annual Symposium on Combinatorial Pattern Matching Pub Date : 2017-02-01 DOI: 10.4230/LIPIcs.CPM.2017.10
Bartłomiej Dudek, Paweł Gawrychowski, Piotr Ostropolski-Nalewaja
Abstract: In the Maximum Duo-Preservation String Mapping problem we are given two strings and wish to map the letters of the former to the letters of the latter so as to maximise the number of duos. A duo is a pair of consecutive letters that is mapped to a pair of consecutive letters in the same order. This is complementary to the well-studied Minimum Common String Partition problem, where the goal is to partition the former string into blocks that can be permuted and concatenated to obtain the latter string.
Maximum Duo-Preservation String Mapping is APX-hard. After a series of improvements, Brubach [WABI 2016] showed a polynomial-time $3.25$-approximation algorithm. Our main contribution is that for any $\epsilon>0$ there exists a polynomial-time $(2+\epsilon)$-approximation algorithm. Similarly to a previous solution by Boria et al. [CPM 2016], our algorithm uses the local search technique. However, this is used only after a certain preliminary greedy procedure, which gives us more structure and makes a more general local search possible.
We complement this with a specialised version of the algorithm that achieves $2.67$-approximation in quadratic time.
Citations: 7
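The objective in the entry above can be stated in a few lines of code. In this sketch a mapping is represented as a tuple `p`, where position `i` of the first string is mapped to position `p[i]` of the second, and a duo is preserved whenever two consecutive positions are mapped to consecutive positions in the same order. The exhaustive search is only a reference for tiny inputs and has nothing to do with the local search of the paper; it assumes the two strings are anagrams of each other. The names `count_duos` and `max_duos_bruteforce` are hypothetical.

```python
from itertools import permutations


def count_duos(p):
    """Number of consecutive pairs (i, i+1) mapped to consecutive positions in order."""
    return sum(1 for x, y in zip(p, p[1:]) if y == x + 1)


def max_duos_bruteforce(a, b):
    """Best number of preserved duos over all letter-preserving bijections (tiny inputs)."""
    best = 0
    for p in permutations(range(len(b))):
        if all(a[i] == b[p[i]] for i in range(len(a))):
            best = max(best, count_duos(p))
    return best


best = max_duos_bruteforce("abcab", "ababc")
```

For "abcab" and "ababc" the optimum is 3: the block "abc" is mapped as a whole (2 duos) and the block "ab" as a whole (1 duo), which corresponds to a common string partition into 2 blocks.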
Longest Common Extensions with Recompression
Annual Symposium on Combinatorial Pattern Matching Pub Date : 2016-11-16 DOI: 10.4230/LIPIcs.CPM.2017.18
T. I.
Abstract: Given two positions i and j in a string T of length N, a longest common extension (LCE) query asks for the length of the longest common prefix between suffixes beginning at i and j. A compressed LCE data structure stores T in a compressed form while supporting fast LCE queries. In this article we show that the recompression technique is a powerful tool for compressed LCE data structures. We present a new compressed LCE data structure of size O(z lg (N/z)) that supports LCE queries in O(lg N) time, where z is the size of the Lempel-Ziv 77 factorization without self-reference of T. Given T in uncompressed form, we show how to build our data structure in O(N) time and space. Given T in grammar-compressed form, i.e., a straight-line program of size n generating T, we show how to build our data structure in O(n lg (N/n)) time and O(n + z lg (N/z)) space. Our algorithms are deterministic and always return correct answers.
Citations: 30
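The LCE query defined in the entry above has a direct character-by-character implementation, shown here as a reference point. It works on the uncompressed string and takes time proportional to the answer, whereas the data structure of the paper answers the same query in O(lg N) time on compressed input. The function name `lce` and the 0-based positions are assumptions of this sketch.

```python
def lce(t, i, j):
    """Length of the longest common prefix of the suffixes t[i:] and t[j:]."""
    length = 0
    while (i + length < len(t) and j + length < len(t)
           and t[i + length] == t[j + length]):
        length += 1
    return length


t = "abracadabra"
answer = lce(t, 0, 7)  # suffixes "abracadabra" and "abra" share "abra"
```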
Computing All Distinct Squares in Linear Time for Integer Alphabets
Annual Symposium on Combinatorial Pattern Matching Pub Date : 2016-10-11 DOI: 10.4230/LIPIcs.CPM.2017.22
H. Bannai, Shunsuke Inenaga, D. Köppl
Abstract: Given a string on an integer alphabet, we present an algorithm that computes the set of all distinct squares belonging to this string in time linear in the string length. As an application, we show how to compute the tree topology of the minimal augmented suffix tree in linear time. Aside from that, we elaborate an algorithm computing the longest previous table in a succinct representation using compressed working space.
Citations: 21
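A square is a substring of the form xx for a non-empty string x. As a reference for what the entry above computes, the following sketch enumerates all distinct squares by checking every start position and every period. It runs in cubic time in the worst case and is unrelated to the linear-time algorithm of the paper; the name `distinct_squares` is hypothetical.

```python
def distinct_squares(s):
    """Set of all distinct substrings of s that have the form x + x, x non-empty."""
    squares = set()
    n = len(s)
    for p in range(1, n // 2 + 1):        # p = length of the root x
        for i in range(n - 2 * p + 1):    # i = start of the candidate square
            if s[i:i + p] == s[i + p:i + 2 * p]:
                squares.add(s[i:i + 2 * p])
    return squares


found = distinct_squares("aabaab")
```

In "aabaab" the square "aa" occurs twice but is counted once, and the whole string is itself a square with root "aab".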