Fast algorithms for window accumulated subsequence matching problem
Zdenek Tronicek
Acta Informatica 63(1), published 2026-02-25, Journal Article
DOI: 10.1007/s00236-026-00523-4
https://link.springer.com/article/10.1007/s00236-026-00523-4
Citations: 0
Abstract
A subsequence of a string T is any string that can be obtained by removing zero or more symbols from T. This paper addresses the Window Accumulated Subsequence matching Problem (WASP), defined as follows: given two strings, the text T and the pattern P, and a positive integer w, the window size, count the substrings of T of length w that contain P as a subsequence. Three algorithms for this problem are introduced: a bit-parallel approach, an algorithm that preprocesses the pattern, and an algorithm that preprocesses the text. The bit-parallel approach outperforms the state-of-the-art algorithm, and the other two algorithms outperform the bit-parallel approach for small alphabets, short patterns, and windows that are not much larger than the pattern. Furthermore, the paper describes a preprocessing of the text that solves WASP for a fixed window size and every possible pattern of a given size. This is beneficial when WASP must be solved for a single text and many patterns, because once the text is preprocessed, each query is answered quickly.
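To make the problem definition concrete, here is a minimal Python sketch of two baseline ways to count matching windows. These are not the paper's algorithms, which the abstract does not describe; the function names `is_subsequence`, `wasp_naive`, and `wasp_dp` are hypothetical and chosen for illustration only. The first function checks each window directly. The second uses a standard dynamic-programming idea: for each end position, track the latest start from which each pattern prefix still occurs as a subsequence.

```python
def is_subsequence(pattern: str, text: str) -> bool:
    """Return True if pattern can be obtained from text by deleting symbols."""
    it = iter(text)
    # 'ch in it' advances the iterator, so matches are found left to right.
    return all(ch in it for ch in pattern)


def wasp_naive(text: str, pattern: str, w: int) -> int:
    """Count size-w substrings of text that contain pattern as a subsequence.

    Direct restatement of the definition: test every window separately.
    """
    if w <= 0:
        raise ValueError("window size w must be a positive integer")
    return sum(
        is_subsequence(pattern, text[i:i + w])
        for i in range(len(text) - w + 1)
    )


def wasp_dp(text: str, pattern: str, w: int) -> int:
    """Same count in O(len(text) * len(pattern)) time (illustrative baseline).

    latest[k] is the largest start s such that pattern[:k + 1] is a
    subsequence of text[s:j + 1] for the current end position j, or -1.
    """
    if w <= 0:
        raise ValueError("window size w must be a positive integer")
    n, m = len(text), len(pattern)
    if m == 0:
        return max(0, n - w + 1)  # the empty pattern occurs in every window
    latest = [-1] * m
    count = 0
    for j, ch in enumerate(text):
        # Go from the longest prefix down so latest[k - 1] still refers to
        # the text ending at position j - 1.
        for k in range(m - 1, -1, -1):
            if ch == pattern[k]:
                latest[k] = j if k == 0 else latest[k - 1]
        start = j - w + 1  # first index of the window ending at j
        if start >= 0 and latest[m - 1] >= start:
            count += 1
    return count
```

For example, with the text `abcab`, the pattern `ab`, and w = 3, the windows are `abc`, `bca`, and `cab`; the first and last contain `ab` as a subsequence, so both functions return 2.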
About the journal:
Acta Informatica provides international dissemination of articles on formal methods for the design and analysis of programs, computing systems and information structures, as well as related fields of Theoretical Computer Science such as Automata Theory, Logic in Computer Science, and Algorithmics.
Topics of interest include:
• semantics of programming languages
• models and modeling languages for concurrent, distributed, reactive and mobile systems
• models and modeling languages for timed, hybrid and probabilistic systems
• specification, program analysis and verification
• model checking and theorem proving
• modal, temporal, first- and higher-order logics, and their variants
• constraint logic, SAT/SMT-solving techniques
• theoretical aspects of databases, semi-structured data and finite model theory
• theoretical aspects of artificial intelligence, knowledge representation, description logic
• automata theory, formal languages, term and graph rewriting
• game-based models, synthesis
• type theory, typed calculi
• algebraic, coalgebraic and categorical methods
• formal aspects of performance, dependability and reliability analysis
• foundations of information and network security
• parallel, distributed and randomized algorithms
• design and analysis of algorithms
• foundations of network and communication protocols.