{"title":"Parameterized Algorithms for Matrix Completion With Radius Constraints","authors":"Tomohiro Koana, Vincent Froese, R. Niedermeier","doi":"10.4230/LIPIcs.CPM.2020.20","DOIUrl":"https://doi.org/10.4230/LIPIcs.CPM.2020.20","url":null,"abstract":"Considering matrices with missing entries, we study NP-hard matrix completion problems where the resulting completed matrix shall have limited (local) radius. In the pure radius version, this means that the goal is to fill in the entries such that there exists a 'center string' which has Hamming distance to all matrix rows as small as possible. In stringology, this problem is also known as Closest String with Wildcards. In the local radius version, the requested center string must be one of the rows of the completed matrix. Hermelin and Rozenberg [CPM 2014, TCS 2016] performed parameterized complexity studies for Closest String with Wildcards. We answer one of their open questions, fix a bug concerning a fixed-parameter tractability result in their work, and improve some upper running time bounds. For the local radius case, we reveal a computational complexity dichotomy. In general, our results indicate that, although being NP-hard as well, this variant often allows for faster (fixed-parameter) algorithms.","PeriodicalId":236737,"journal":{"name":"Annual Symposium on Combinatorial Pattern Matching","volume":"81 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-02-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116214210","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Chaining with overlaps revisited","authors":"V. Mäkinen, Kristoffer Sahlin","doi":"10.4230/LIPIcs.CPM.2020.25","DOIUrl":"https://doi.org/10.4230/LIPIcs.CPM.2020.25","url":null,"abstract":"Chaining algorithms aim to form a semi-global alignment of two sequences based on a set of anchoring local alignments as input. Depending on the optimization criteria and the exact definition of a chain, there are several $O(n log n)$ time algorithms to solve this problem optimally, where $n$ is the number of input anchors. \u0000In this paper, we focus on a formulation allowing the anchors to overlap in a chain. This formulation was studied by Shibuya and Kurochin (WABI 2003), but their algorithm comes with no proof of correctness. We revisit and modify their algorithm to consider a strict definition of precedence relation on anchors, adding the required derivation to convince on the correctness of the resulting algorithm that runs in $O(n log^2 n)$ time on anchors formed by exact matches. With the more relaxed definition of precedence relation considered by Shibuya and Kurochin or when anchors are non-nested such as matches of uniform length ($k$-mers), the algorithm takes $O(n log n)$ time. \u0000We also establish a connection between chaining with overlaps to the widely studied longest common subsequence (LCS) problem.","PeriodicalId":236737,"journal":{"name":"Annual Symposium on Combinatorial Pattern Matching","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-01-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133846537","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Maximal Common Subsequence Algorithms","authors":"Y. Sakai","doi":"10.4230/LIPIcs.CPM.2018.1","DOIUrl":"https://doi.org/10.4230/LIPIcs.CPM.2018.1","url":null,"abstract":"A common subsequence of two strings is maximal, if inserting any character into the subsequence can no longer yield a common subsequence of the two strings. The present article proposes a (sub)linearithmic-time, linear-space algorithm for finding a maximal common subsequence of two strings and also proposes a linear-time algorithm for determining if a common subsequence of two strings is maximal.","PeriodicalId":236737,"journal":{"name":"Annual Symposium on Combinatorial Pattern Matching","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126152151","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Y. Urabe, Yuto Nakashima, Shunsuke Inenaga, H. Bannai, M. Takeda
{"title":"On the Size of Overlapping Lempel-Ziv and Lyndon Factorizations","authors":"Y. Urabe, Yuto Nakashima, Shunsuke Inenaga, H. Bannai, M. Takeda","doi":"10.4230/LIPIcs.CPM.2019.29","DOIUrl":"https://doi.org/10.4230/LIPIcs.CPM.2019.29","url":null,"abstract":"Lempel-Ziv (LZ) factorization and Lyndon factorization are well-known factorizations of strings. Recently, Karkkainen et al. studied the relation between the sizes of the two factorizations, and showed that the size of the Lyndon factorization is always smaller than twice the size of the non-overlapping LZ factorization [STACS 2017]. In this paper, we consider a similar problem for the overlapping version of the LZ factorization. Since the size of the overlapping LZ factorization is always smaller than the size of the non-overlapping LZ factorization and, in fact, can even be an O(log n) factor smaller, it is not immediately clear whether a similar bound as in previous work would hold. Nevertheless, in this paper, we prove that the size of the Lyndon factorization is always smaller than four times the size of the overlapping LZ factorization.","PeriodicalId":236737,"journal":{"name":"Annual Symposium on Combinatorial Pattern Matching","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125742855","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hamming Distance Completeness","authors":"K. Labib, P. Uznański, D. Wolleb-Graf","doi":"10.4230/LIPIcs.CPM.2019.14","DOIUrl":"https://doi.org/10.4230/LIPIcs.CPM.2019.14","url":null,"abstract":"We show, given a binary integer function diamond that is piecewise polynomial, that (+,diamond) vector products are equivalent under one-to-polylog reductions to the computation of the Hamming distance. Examples include the dominance and l_{2p+1} distances for constant p. Our results imply equivalence (up to polylog factors) between the complexity of computing All Pairs Hamming Distance, All Pairs l_{2p+1} Distance and Dominance Matrix Product, and equivalence between Hamming Distance Pattern Matching, l_{2p+1} Pattern Matching and Less-Than Pattern Matching. The resulting algorithms for l_{2p+1} Pattern Matching and All Pairs l_{2p+1}, for 2p+1 = 3,5,7,... are likely to be optimal, given lack of progress in improving upper bounds for Hamming distance in the past 30 years. While reductions between selected pairs of products were presented in the past, our work is the first to generalize them to a general class of functions, showing that a wide class of \"intermediate\" complexity problems are in fact equivalent.","PeriodicalId":236737,"journal":{"name":"Annual Symposium on Combinatorial Pattern Matching","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127205517","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
H. Bannai, Juha Kärkkäinen, D. Köppl, Marcin Piatkowski
{"title":"Indexing the Bijective BWT","authors":"H. Bannai, Juha Kärkkäinen, D. Köppl, Marcin Piatkowski","doi":"10.4230/LIPIcs.CPM.2019.17","DOIUrl":"https://doi.org/10.4230/LIPIcs.CPM.2019.17","url":null,"abstract":"The Burrows-Wheeler transform (BWT) is a permutation whose applications are prevalent in data compression and text indexing. The bijective BWT is a bijective variant of it that has not yet been studied for text indexing applications. We fill this gap by proposing a self-index built on the bijective BWT . The self-index applies the backward search technique of the FM-index to find a pattern P with O(|P| lg|P|) backward search steps.","PeriodicalId":236737,"journal":{"name":"Annual Symposium on Combinatorial Pattern Matching","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123124845","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Invertible Transform for Efficient String Matching in Labeled Digraphs","authors":"Abhinav Nellore, Austin Nguyen, Reid F. Thompson","doi":"10.4230/LIPIcs.CPM.2021.20","DOIUrl":"https://doi.org/10.4230/LIPIcs.CPM.2021.20","url":null,"abstract":"Let $G = (V, E)$ be a digraph where each vertex is unlabeled, each edge is labeled by a character in some alphabet $Omega$, and any two edges with both the same head and the same tail have different labels. The powerset construction gives a transform of $G$ into a weakly connected digraph $G' = (V', E')$ that enables solving the decision problem of whether there is a walk in $G$ matching an arbitrarily long query string $q$ in time linear in $|q|$ and independent of $|E|$ and $|V|$. We show $G$ is uniquely determined by $G'$ when for every $v_ell in V$, there is some distinct string $s_ell$ on $Omega$ such that $v_ell$ is the origin of a closed walk in $G$ matching $s_ell$, and no other walk in $G$ matches $s_ell$ unless it starts and ends at $v_ell$. We then exploit this invertibility condition to strategically alter any $G$ so its transform $G'$ enables retrieval of all $t$ terminal vertices of walks in the unaltered $G$ matching $q$ in $O(|q| + t log |V|)$ time. We conclude by proposing two defining properties of a class of transforms that includes the Burrows-Wheeler transform and the transform presented here.","PeriodicalId":236737,"journal":{"name":"Annual Symposium on Combinatorial Pattern Matching","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-05-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115293386","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Conversion from RLBWT to LZ77","authors":"T. Nishimoto, Yasuo Tabei","doi":"10.4230/LIPIcs.CPM.2019.9","DOIUrl":"https://doi.org/10.4230/LIPIcs.CPM.2019.9","url":null,"abstract":"Converting a compressed format of a string into another compressed format without an explicit decompression is one of the central research topics in string processing. We discuss the problem of converting the run-length Burrows-Wheeler Transform (RLBWT) of a string to Lempel-Ziv 77 (LZ77) phrases of the reversed string. The first results with Policriti and Prezza's conversion algorithm [Algorithmica 2018] were $O(n log r)$ time and $O(r)$ working space for length of the string $n$, number of runs $r$ in the RLBWT, and number of LZ77 phrases $z$. Recent results with Kempa's conversion algorithm [SODA 2019] are $O(n / log n + r log^{9} n + z log^{9} n)$ time and $O(n / log_{sigma} n + r log^{8} n)$ working space for the alphabet size $sigma$ of the RLBWT. In this paper, we present a new conversion algorithm by improving Policriti and Prezza's conversion algorithm where dynamic data structures for general purpose are used. We argue that these dynamic data structures can be replaced and present new data structures for faster conversion. The time and working space of our conversion algorithm with new data structures are $O(n min { log log n, sqrt{frac{log r}{loglog r}} })$ and $O(r)$, respectively.","PeriodicalId":236737,"journal":{"name":"Annual Symposium on Combinatorial Pattern Matching","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-02-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121668881","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mitsuru Funakoshi, Yuto Nakashima, Shunsuke Inenaga, H. Bannai, M. Takeda
{"title":"Faster queries for longest substring palindrome after block edit","authors":"Mitsuru Funakoshi, Yuto Nakashima, Shunsuke Inenaga, H. Bannai, M. Takeda","doi":"10.4230/LIPICS.CPM.2019.27","DOIUrl":"https://doi.org/10.4230/LIPICS.CPM.2019.27","url":null,"abstract":"Palindromes are important objects in strings which have been extensively studied from combinatorial, algorithmic, and bioinformatics points of views. Manacher [J. ACM 1975] proposed a seminal algorithm that computes the longest substring palindromes (LSPals) of a given string in O(n) time, where n is the length of the string. In this paper, we consider the problem of finding the LSPal after the string is edited. We present an algorithm that uses O(n) time and space for preprocessing, and answers the length of the LSPals in O(l + log log n) time, after a substring in T is replaced by a string of arbitrary length l. This outperforms the query algorithm proposed in our previous work [CPM 2018] that uses O(l + log n) time for each query.","PeriodicalId":236737,"journal":{"name":"Annual Symposium on Combinatorial Pattern Matching","volume":"56 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115143511","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ryo Sugahara, Yuto Nakashima, Shunsuke Inenaga, H. Bannai, M. Takeda
{"title":"Computing runs on a trie","authors":"Ryo Sugahara, Yuto Nakashima, Shunsuke Inenaga, H. Bannai, M. Takeda","doi":"10.4230/LIPIcs.CPM.2019.23","DOIUrl":"https://doi.org/10.4230/LIPIcs.CPM.2019.23","url":null,"abstract":"A maximal repetition, or run, in a string, is a periodically maximal substring whose smallest period is at most half the length of the substring. In this paper, we consider runs that correspond to a path on a trie, or in other words, on a rooted edge-labeled tree where the endpoints of the path must be a descendant/ancestor of the other. For a trie with $n$ edges, we show that the number of runs is less than $n$. We also show an $O(nsqrt{log n}log log n)$ time and $O(n)$ space algorithm for counting and finding the shallower endpoint of all runs. We further show an $O(nsqrt{log n}log^2log n)$ time and $O(n)$ space algorithm for finding both endpoints of all runs.","PeriodicalId":236737,"journal":{"name":"Annual Symposium on Combinatorial Pattern Matching","volume":"-1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123698923","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}