The longest common subsequence problem for small alphabets in the word RAM model

IF 0.6 · CAS Zone 4 (Computer Science) · Q4 COMPUTER SCIENCE, INFORMATION SYSTEMS

Information Processing Letters Pub Date : 2025-04-10 DOI:10.1016/j.ipl.2025.106579

Rodrigo Alexander Castro Campos

Citations: 0

Abstract

Given two strings of lengths $m$ and $n$, with $m \leq n$, the longest common subsequence problem consists of computing a common subsequence of maximum length by deleting symbols from both strings. While the $O(mn)$ algorithm devised in 1974 is optimal in the most general setting, algorithms that depend on parameters other than $m$ and $n$ have been proposed since then. In the word RAM model, let $w$ be the word size, $s$ be the alphabet size, $d$ be the number of dominant symbol matches between the strings, and $p$ be the length of the longest common subsequence. Fast algorithms for this problem have complexities $O(mn / \log n)$, $O(mn / w)$, $O(ns + \min(p(n - p), pm))$, $O(n \log s + d \log \log \min(d, mn / d))$, $O(ns + \min(ds, pm))$, and $O(ns + s!^{2} s + d \log s)$. In this work, we present an $O(n(s + \log^{*} n) + \min(d \log s, pm))$ algorithm when $s \in O(w)$, and also an $O(n(s + \log^{*} n) + d)$ algorithm when $s \leq w$ which uses bitwise instructions that became recently available in modern processors.
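For orientation, the two oldest bounds quoted in the abstract can be illustrated with a minimal Python sketch. It is not the algorithm presented in the paper. The first function is the textbook $O(mn)$ dynamic program with a traceback. The second is a well-known bit-parallel recurrence of the kind behind the $O(mn / w)$ bound, and it returns only the length. The function names `lcs_dp` and `lcs_length_bitparallel` are hypothetical, and Python's arbitrary-precision integers stand in for machine words, so the $1/w$ speed-up is only simulated here.

```python
def lcs_dp(a: str, b: str) -> str:
    """Textbook O(mn) dynamic program; returns one longest common subsequence."""
    m, n = len(a), len(b)
    # table[i][j] = LCS length of the prefixes a[:i] and b[:j]
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    # Walk back from the bottom-right corner to recover the symbols.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))


def lcs_length_bitparallel(a: str, b: str) -> int:
    """Bit-parallel LCS length; one m-bit vector is updated per symbol of b."""
    m = len(a)
    mask = (1 << m) - 1
    # match[c] has bit i set exactly when a[i] == c (one vector per symbol,
    # which is why a small alphabet keeps preprocessing cheap).
    match = {}
    for i, ch in enumerate(a):
        match[ch] = match.get(ch, 0) | (1 << i)
    v = mask  # all ones: no row of the DP table has increased yet
    for ch in b:
        mc = match.get(ch, 0)
        u = v & mc
        # The addition propagates carries across runs of ones, which is
        # what replaces the inner loop of the dynamic program.
        v = ((v + u) | (v & ~mc)) & mask
    # Each zero bit in v marks a position where the LCS length grew by one.
    return m - bin(v).count("1")
```

On the classic pair `"AGGTAB"` and `"GXTXAYB"`, both functions agree on length 4, and `lcs_dp` recovers `"GTAB"`.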


Source journal

Information Processing Letters (Engineering & Technology: Computer Science, Information Systems)

CiteScore: 1.80

Self-citation rate: 0.00%

Review time: 7.3 months

About the journal: Information Processing Letters invites submission of original research articles that focus on fundamental aspects of information processing and computing. This naturally includes work in the broadly understood field of theoretical computer science; although papers in all areas of scientific inquiry will be given consideration, provided that they describe research contributions credibly motivated by applications to computing and involve rigorous methodology. High quality experimental papers that address topics of sufficiently broad interest may also be considered. Since its inception in 1971, Information Processing Letters has served as a forum for timely dissemination of short, concise and focused research contributions. Continuing with this tradition, and to expedite the reviewing process, manuscripts are generally limited in length to nine pages when they appear in print.