The longest common subsequence problem for small alphabets in the word RAM model
Rodrigo Alexander Castro Campos
Information Processing Letters, volume 190, Article 106579. Published 2025-04-10. DOI: 10.1016/j.ipl.2025.106579
https://www.sciencedirect.com/science/article/pii/S0020019025000237
Citation count: 0
Abstract
Given two strings of lengths $m$ and $n$, with $m \le n$, the longest common subsequence problem consists of computing a common subsequence of maximum length by deleting symbols from both strings. While the $O(mn)$ algorithm devised in 1974 is optimal in the most general setting, algorithms that depend on parameters other than $m$ and $n$ have been proposed since then. In the word RAM model, let $w$ be the word size, $s$ be the alphabet size, $d$ be the number of dominant symbol matches between the strings, and $p$ be the length of the longest common subsequence. Fast algorithms for this problem have complexities $O(mn/\log n)$, $O(mn/w)$, $O(ns + \min(p(n-p), pm))$, $O(n\log s + d\log\log\min(d, mn/d))$, $O(ns + \min(ds, pm))$, and $O(ns + s!^{2}s + d\log s)$. In this work, we present an $O(n(s + \log^{*} n) + \min(d\log s, pm))$ algorithm when $s \in O(w)$, and also an $O(n(s + \log^{*} n) + d)$ algorithm when $s \le w$, which uses bitwise instructions that have recently become available in modern processors.
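To make two of the complexity classes named in the abstract concrete, here is a minimal Python sketch. It is not the algorithm of this paper, and it does not use the processor instructions the abstract alludes to. The function `lcs_quadratic` is the textbook quadratic dynamic program, shown as an instance of an $O(mn)$ method. The function `lcs_length_bitparallel` is a well-known bit-vector recurrence of the kind that achieves $O(mn/w)$ on a word RAM. Python's arbitrary-precision integers stand in for machine words here, so the sketch illustrates the recurrence but not the word-level cost. Both function names are hypothetical and chosen for this illustration only.

```python
def lcs_quadratic(a: str, b: str) -> str:
    """Textbook O(mn) dynamic program; returns one longest common subsequence."""
    m, n = len(a), len(b)
    # table[i][j] holds the LCS length of the prefixes a[:i] and b[:j].
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    # Walk back from the bottom-right corner to recover one optimal answer.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))


def lcs_length_bitparallel(a: str, b: str) -> int:
    """Bit-vector recurrence for the LCS length (illustrative, length only)."""
    m = len(a)
    full = (1 << m) - 1  # mask that keeps only the low m bits
    # match[ch] has bit i set exactly when a[i] == ch.
    match = {}
    for i, ch in enumerate(a):
        match[ch] = match.get(ch, 0) | (1 << i)
    v = full  # all ones: no common symbol found yet
    for ch in b:
        u = v & match.get(ch, 0)
        # One addition, one subtraction and one OR update a whole DP column.
        v = ((v + u) | (v - u)) & full
    # Each zero bit among the low m bits marks one unit of LCS length.
    return m - bin(v).count("1")
```

Calling `lcs_quadratic("AGGTAB", "GXTXAYB")` returns one longest common subsequence, and `lcs_length_bitparallel` applied to the same pair should report the length of that string. The bit-vector version trades the ability to recover the subsequence for a much more compact per-character update.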
About the journal:
Information Processing Letters invites submission of original research articles that focus on fundamental aspects of information processing and computing. This naturally includes work in the broadly understood field of theoretical computer science. Papers in all areas of scientific inquiry will also be given consideration, provided that they describe research contributions credibly motivated by applications to computing and involve rigorous methodology. High-quality experimental papers that address topics of sufficiently broad interest may also be considered.
Since its inception in 1971, Information Processing Letters has served as a forum for timely dissemination of short, concise and focused research contributions. Continuing with this tradition, and to expedite the reviewing process, manuscripts are generally limited in length to nine pages when they appear in print.