Annual Symposium on Combinatorial Pattern Matching: Latest Papers (Page 2)

MONI can find k-MEMs

Annual Symposium on Combinatorial Pattern Matching Pub Date : 2022-02-10 DOI: 10.4230/LIPIcs.CPM.2023.26

T. Gagie

Citations: 3

The Normalized Edit Distance with Uniform Operation Costs is a Metric

Annual Symposium on Combinatorial Pattern Matching Pub Date : 2022-01-16 DOI: 10.4230/LIPIcs.CPM.2022.17

D. Fisman, Joshua Grogin, Oded Margalit, Gera Weiss

Citations: 3

Arbitrary-length analogs to de Bruijn sequences

Annual Symposium on Combinatorial Pattern Matching Pub Date : 2021-08-17 DOI: 10.4230/LIPIcs.CPM.2022.9

Abhinav Nellore, Rachel A. Ward

Citations: 1

Ranking Bracelets in Polynomial Time

Annual Symposium on Combinatorial Pattern Matching Pub Date : 2021-04-09 DOI: 10.4230/LIPIcs.CPM.2021.4

Duncan Adamson, Argyrios Deligkas, V. Gusev, I. Potapov

Citations: 8

A Linear Time Algorithm for Constructing Hierarchical Overlap Graphs

Annual Symposium on Combinatorial Pattern Matching Pub Date : 2021-02-25 DOI: 10.4230/LIPIcs.CPM.2021.22

Sangsoo Park, Sung Gwan Park, Bastien Cazaux, Kunsoo Park, Eric Rivals

Citations: 5

Revisiting the Parameterized Complexity of Maximum-Duo Preservation String Mapping

Annual Symposium on Combinatorial Pattern Matching Pub Date : 2020-12-01 DOI: 10.4230/LIPIcs.CPM.2017.11

Christian Komusiewicz, Mateus de Oliveira Oliveira, M. Zehavi

Citations: 4

AWLCO: All-Window Length Co-Occurrence

Annual Symposium on Combinatorial Pattern Matching Pub Date : 2020-11-29 DOI: 10.4230/LIPIcs.CPM.2021.24

Joshua Sobel, Noah Bertram, C. Ding, F. Nargesian, D. Gildea

Abstract: Analyzing patterns in a sequence of events has applications in text analysis, computer programming, and genomics research. In this paper, we consider the all-window-length analysis model which analyzes a sequence of events with respect to windows of all lengths. We study the exact co-occurrence counting problem for the all-window-length analysis model. Our first algorithm is an offline algorithm that counts all-window-length co-occurrences by performing multiple passes over a sequence and computing single-window-length co-occurrences. This algorithm has the time complexity $O(n)$ for each window length and thus a total complexity of $O(n^2)$ and the space complexity $O(|I|)$ for a sequence of size $n$ and an itemset of size $|I|$. We propose AWLCO, an online algorithm that computes all-window-length co-occurrences in a single pass with the expected time complexity of $O(n)$ and space complexity of $O(\sqrt{n|I|})$. Following this, we generalize our use case to patterns in which we propose an algorithm that computes all-window-length co-occurrence with expected time complexity $O(n|I|)$ and space complexity $O(\sqrt{n|I|} + e_{max}|I|)$, where $e_{max}$ is the length of the largest pattern.

Citations: 1
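
The offline baseline mentioned in the AWLCO abstract (one linear pass per window length, quadratic in total) can be sketched as follows. This is only an illustration, not the authors' code: it assumes that a "co-occurrence" for window length w means a length-w window containing every item of the itemset, and the function names are hypothetical. It does not implement the online AWLCO algorithm itself.

```python
from collections import Counter


def cooccurrences_for_window(seq, itemset, w):
    """Count the length-w windows of seq that contain every item of itemset.

    Assumed definition of co-occurrence (not taken from the paper).
    Runs in O(n) time and O(|I|) extra space for one window length.
    """
    items = set(itemset)
    if w <= 0 or w > len(seq) or not items:
        return 0
    counts = Counter()  # occurrences of tracked items inside the window
    present = 0         # number of distinct tracked items inside the window
    total = 0
    for i, x in enumerate(seq):
        # Slide the right edge of the window over position i.
        if x in items:
            counts[x] += 1
            if counts[x] == 1:
                present += 1
        # Drop the element that just left the window on the left.
        if i >= w:
            y = seq[i - w]
            if y in items:
                counts[y] -= 1
                if counts[y] == 0:
                    present -= 1
        # The window seq[i-w+1 .. i] is complete once i >= w - 1.
        if i >= w - 1 and present == len(items):
            total += 1
    return total


def all_window_length_cooccurrences(seq, itemset):
    """Repeat the single-window pass for every w: O(n^2) time overall."""
    return {w: cooccurrences_for_window(seq, itemset, w)
            for w in range(1, len(seq) + 1)}
```

For example, `all_window_length_cooccurrences("abcab", {"a", "b"})` maps each window length from 1 to 5 to the number of windows that contain both `a` and `b`.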

The Longest Run Subsequence Problem: Further Complexity Results

Annual Symposium on Combinatorial Pattern Matching Pub Date : 2020-11-16 DOI: 10.4230/LIPIcs.CPM.2021.14

R. Dondi, F. Sikora

Citations: 2

String Sanitization Under Edit Distance: Improved and Generalized

Annual Symposium on Combinatorial Pattern Matching Pub Date : 2020-07-16 DOI: 10.4230/LIPIcs.CPM.2021.19

Takuya Mieno, S. Pissis, L. Stougie, Michelle Sweering

Abstract: Let $W$ be a string of length $n$ over an alphabet $\Sigma$, $k$ be a positive integer, and $\mathcal{S}$ be a set of length-$k$ substrings of $W$. The ETFS problem asks us to construct a string $X_{\mathrm{ED}}$ such that: (i) no string of $\mathcal{S}$ occurs in $X_{\mathrm{ED}}$; (ii) the order of all other length-$k$ substrings over $\Sigma$ is the same in $W$ and in $X_{\mathrm{ED}}$; and (iii) $X_{\mathrm{ED}}$ has minimal edit distance to $W$. When $W$ represents an individual's data and $\mathcal{S}$ represents a set of confidential patterns, the ETFS problem asks for transforming $W$ to preserve its privacy and its utility [Bernardini et al., ECML PKDD 2019]. ETFS can be solved in $\mathcal{O}(n^2 k)$ time [Bernardini et al., CPM 2020]. The same paper shows that ETFS cannot be solved in $\mathcal{O}(n^{2-\delta})$ time, for any $\delta>0$, unless the Strong Exponential Time Hypothesis (SETH) is false. Our main results can be summarized as follows: (i) an $\mathcal{O}(n^2 \log^2 k)$-time algorithm to solve ETFS; and (ii) an $\mathcal{O}(n^2 \log^2 n)$-time algorithm to solve AETFS, a generalization of ETFS in which the elements of $\mathcal{S}$ can have arbitrary lengths. Our algorithms are thus optimal up to polylogarithmic factors, unless SETH fails. Let us also stress that our algorithms work under edit distance with arbitrary weights at no extra cost. As a bonus, we show how to modify some known techniques, which speed up the standard edit distance computation, to be applied to our problems. Beyond string sanitization, our techniques may inspire solutions to other problems related to regular expressions or context-free grammars.

Citations: 1
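
Conditions (i) and (ii) of the ETFS problem, as stated in the abstract above, can be checked for a candidate string with a short sketch. It rests on assumptions that are not in the listing: condition (ii) is read as "the sequence of non-sensitive length-k substrings over the alphabet is identical in W and X", the candidate may contain a separator such as `#` that lies outside the alphabet, and the function names are hypothetical. Condition (iii), minimality of the edit distance, is not checked.

```python
def k_substrings(s, k, alphabet):
    """Length-k substrings of s, left to right, using only alphabet letters."""
    return [s[i:i + k] for i in range(len(s) - k + 1)
            if set(s[i:i + k]) <= alphabet]


def is_feasible(w, x, k, sensitive, alphabet=None):
    """Check ETFS conditions (i) and (ii) for candidate x (assumed reading).

    (i)  no sensitive length-k string occurs in x;
    (ii) the non-sensitive length-k substrings over the alphabet appear
         in the same order in w and in x.
    """
    alphabet = set(w) if alphabet is None else set(alphabet)
    sensitive = set(sensitive)
    in_x = k_substrings(x, k, alphabet)
    if any(sub in sensitive for sub in in_x):
        return False  # violates (i)
    in_w = [sub for sub in k_substrings(w, k, alphabet)
            if sub not in sensitive]
    return in_w == in_x  # (ii) under the sequence-equality reading
```

With `w = "abcab"`, `k = 2` and the sensitive set `{"ca"}`, the candidate `"abc#ab"` passes both checks, while `"abcab"` (still contains `ca`) and `"abab"` (changes the order of the remaining substrings) do not.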

String Sanitization Under Edit Distance

Annual Symposium on Combinatorial Pattern Matching Pub Date : 2020-06-09 DOI: 10.4230/LIPIcs.CPM.2020.7

G. Bernardini, Huiping Chen, G. Loukides, N. Pisanti, S. Pissis, L. Stougie, Michelle Sweering

Abstract: Let W be a string of length n over an alphabet Σ, k be a positive integer, and S be a set of length-k substrings of W. The ETFS problem asks us to construct a string X_{ED} such that: (i) no string of S occurs in X_{ED}; (ii) the order of all other length-k substrings over Σ is the same in W and in X_{ED}; and (iii) X_{ED} has minimal edit distance to W. When W represents an individual's data and S represents a set of confidential substrings, algorithms solving ETFS can be applied for utility-preserving string sanitization [Bernardini et al., ECML PKDD 2019]. Our first result here is an algorithm to solve ETFS in O(kn²) time, which improves on the state of the art [Bernardini et al., arXiv 2019] by a factor of |Σ|. Our algorithm is based on a non-trivial modification of the classic dynamic programming algorithm for computing the edit distance between two strings. Notably, we also show that ETFS cannot be solved in O(n^{2-δ}) time, for any δ>0, unless the strong exponential time hypothesis is false. To achieve this, we reduce the edit distance problem, which is known to admit the same conditional lower bound [Bringmann and Künnemann, FOCS 2015], to ETFS.

Citations: 10
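
For reference, the classic dynamic programming algorithm for edit distance that the abstract above says the ETFS algorithm modifies can be sketched as below. This is the textbook unit-cost version (insertions, deletions and substitutions each cost 1), kept to two rows of the table. It is not the paper's modified algorithm, and the function name is only illustrative.

```python
def edit_distance(a, b):
    """Unit-cost edit distance between strings a and b.

    Textbook dynamic program: O(len(a) * len(b)) time, O(len(b)) space.
    """
    # prev[j] = distance between the empty prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]  # distance between a[:i] and the empty prefix of b
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete ca from a
                cur[j - 1] + 1,            # insert cb into a
                prev[j - 1] + (ca != cb),  # match (cost 0) or substitute
            ))
        prev = cur
    return prev[-1]
```

Calling `edit_distance("kitten", "sitting")` returns 3 (two substitutions and one insertion).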