The edit distance to k-subsequence universality

IF 0.9 | Zone 3 (Computer Science) | JCR Q1 (BUSINESS, FINANCE)

Journal of Computer and System Sciences, Volume 154, Article 103681 | Pub Date: 2025-06-16 | DOI: 10.1016/j.jcss.2025.103681

Joel D. Day, Pamela Fleischmann, Maria Kosche, Tore Koß, Florin Manea, Stefan Siemer


Citations: 0

Abstract




A word u is a subsequence of another word w if u is obtained from w by deleting some of its letters. In the 1970s, Simon defined the relation $\sim_k$ (now called Simon congruence) as follows: two words having the same set of subsequences of length k are $\sim_k$-congruent. It is thus natural to ask, for non-k-equivalent words w and u, what is the minimal number of edit operations that we need to perform on w to obtain a word which is $\sim_k$-equivalent to u. Here, we consider this problem in a specific setting: when u is a k-subsequence universal word. A word u with $\operatorname{alph}(u) = \Sigma$ is called k-subsequence universal if the set of length-k subsequences of u contains all possible words of length k over Σ. As such, our results are a series of efficient algorithms computing the edit distance from w to the language of k-subsequence universal words.
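
To make the definitions above concrete, here is a minimal Python sketch. It is not the paper's method: the abstract does not describe its algorithms, and all function names here are hypothetical. The universality check uses the standard greedy "arch" characterisation (a word is k-subsequence universal exactly when it splits, left to right, into at least k blocks that each contain every letter of Σ), which is an assumption brought in from outside the abstract. The edit distance is computed by exhaustive breadth-first search, assuming the edit operations are single-letter insertion, deletion, and substitution; it is only practical for very short words.

```python
from collections import deque
from itertools import product


def is_subsequence(u: str, w: str) -> bool:
    """Return True if u can be obtained from w by deleting letters."""
    it = iter(w)
    return all(ch in it for ch in u)  # each `in` consumes the iterator


def universality_index(w: str, sigma: set) -> int:
    """Count greedy left-to-right blocks of w that contain all of sigma."""
    if not sigma:
        raise ValueError("alphabet must be non-empty")
    arches, seen = 0, set()
    for ch in w:
        if ch in sigma:
            seen.add(ch)
        if len(seen) == len(sigma):  # block complete: start a new one
            arches += 1
            seen.clear()
    return arches


def is_k_universal(w: str, k: int, sigma: set) -> bool:
    """Arch-based test for k-subsequence universality over sigma."""
    return universality_index(w, sigma) >= k


def is_k_universal_bruteforce(w: str, k: int, sigma: set) -> bool:
    """Direct use of the definition: every length-k word is a subsequence."""
    return all(
        is_subsequence("".join(t), w) for t in product(sorted(sigma), repeat=k)
    )


def edit_distance_to_k_universal_bruteforce(w: str, k: int, sigma: set) -> int:
    """Exhaustive BFS over single-letter edits (assumed operation set)."""
    letters = sorted(sigma)
    queue, visited = deque([(w, 0)]), {w}
    while queue:
        word, dist = queue.popleft()
        if is_k_universal(word, k, sigma):
            return dist
        neighbours = set()
        for i in range(len(word) + 1):
            for c in letters:
                neighbours.add(word[:i] + c + word[i:])  # insertion
        for i in range(len(word)):
            neighbours.add(word[:i] + word[i + 1:])  # deletion
            for c in letters:
                neighbours.add(word[:i] + c + word[i + 1:])  # substitution
        for nxt in neighbours - visited:
            visited.add(nxt)
            queue.append((nxt, dist + 1))
    raise RuntimeError("unreachable: insertions always reach a universal word")
```

For instance, over Σ = {a, b} the word `aabb` is not 2-subsequence universal because `ba` is not one of its subsequences, while appending a single `a` gives `aabba`, which is. The brute-force search therefore reports distance 1 for `aabb` with k = 2.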


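Source journal

Journal of Computer and System Sciences (Engineering & Technology: Computer Science, Theory & Methods)

CiteScore: 3.70

Self-citation rate: 0.00%

Articles published: not listed

Review time: 68 days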

About the journal: The Journal of Computer and System Sciences publishes original research papers in computer science and related subjects in system science, with attention to the relevant mathematical theory. Applications-oriented papers may also be accepted, and they are expected to contain deep analytic evaluation of the proposed solutions.

Research areas include traditional subjects such as:
• Theory of algorithms and computability
• Formal languages
• Automata theory

Contemporary subjects such as:
• Complexity theory
• Algorithmic Complexity
• Parallel & distributed computing
• Computer networks
• Neural networks
• Computational learning theory
• Database theory & practice
• Computer modeling of complex systems
• Security and Privacy