New string attractor-based complexities for infinite words

IF 0.9, Zone 2 (Mathematics), JCR Q2 (MATHEMATICS)
Julien Cassaigne, France Gheeraert, Antonio Restivo, Giuseppe Romana, Marinella Sciortino, Manon Stipulanti
Journal of Combinatorial Theory Series A, volume 208, Article 105936. Published 2024-07-18.
DOI: 10.1016/j.jcta.2024.105936
URL: https://www.sciencedirect.com/science/article/pii/S009731652400075X
Citations: 0

Abstract

A string attractor is a set of positions in a word such that each distinct factor has an occurrence crossing a position from the set. This definition comes from the data compression field, where the size γ* of a smallest string attractor represents a lower bound for the output size of a large family of string compressors exploiting repetitions in words, including BWT-based and LZ-based compressors. For finite words, the combinatorial properties of string attractors were studied in 2021 by Mantaci et al. Later, Schaeffer and Shallit introduced the string attractor profile function, a complexity function that evaluates, for each n > 0, the size γ* of the length-n prefix of a one-sided infinite word.
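To make the definition concrete, the following minimal sketch checks the string attractor property by brute force and computes γ* and the profile function on short prefixes. The function names and the 0-indexed position convention are assumptions made for illustration, not notation from the paper, and the exhaustive search is exponential in the word length, so it is only suitable for very short words.

```python
from itertools import combinations


def is_string_attractor(word: str, positions: set[int]) -> bool:
    """Return True if every distinct factor of `word` has at least one
    occurrence crossing a position in `positions` (0-indexed)."""
    n = len(word)
    for length in range(1, n + 1):
        all_factors = set()
        covered = set()
        for start in range(n - length + 1):
            factor = word[start:start + length]
            all_factors.add(factor)
            # This occurrence spans positions start .. start + length - 1.
            if any(start <= p < start + length for p in positions):
                covered.add(factor)
        if covered != all_factors:
            return False
    return True


def smallest_attractor_size(word: str) -> int:
    """Size of a smallest string attractor of `word`, by exhaustive search."""
    n = len(word)
    for size in range(1, n + 1):
        for candidate in combinations(range(n), size):
            if is_string_attractor(word, set(candidate)):
                return size
    return 0  # the empty word needs no positions


def attractor_profile(word: str, max_n: int) -> list[int]:
    """Smallest attractor size of each prefix of length 1 .. max_n."""
    return [smallest_attractor_size(word[:n]) for n in range(1, max_n + 1)]


print(attractor_profile("abaab", 5))  # [1, 2, 2, 2, 2]
```

A word over two letters needs at least two positions, since each letter is itself a factor, which is why the profile of "abaab" stays at 2 from the second prefix onward.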

A natural development of the research on the topic is to link string attractors with other classical notions of repetitiveness in combinatorics on words. Our contribution in this sense is threefold. First, we explore the relation between the string attractor profile function and other well-known combinatorial complexity functions in the context of infinite words, such as the factor complexity and the property of recurrence. Moreover, we study its asymptotic growth in the case of purely morphic words and obtain a complete description in the binary case. Second, we introduce two new string attractor-based complexity functions, in which the structure and the distribution of positions in a string attractor are taken into account, and we study their combinatorial properties. We also show that these measures provide a finer classification of some infinite families of words, namely the Sturmian and quasi-Sturmian words. Third, we explicitly give the three complexities for some specific morphic words called k-bonacci words.
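As an illustration of the factor complexity mentioned above, the sketch below generates a prefix of a k-bonacci word and counts its distinct factors of a given length. The abstract does not define the k-bonacci morphism, so the code assumes the commonly used one on the alphabet {0, ..., k-1}: i ↦ 0(i+1) for i < k-1 and (k-1) ↦ 0. The function names are hypothetical. Counting factors in a finite prefix only matches the factor complexity of the infinite word when the prefix is long enough to contain every factor of that length.

```python
def k_bonacci_prefix(k: int, length: int) -> list[int]:
    """Prefix of the k-bonacci word, obtained by iterating the (assumed)
    morphism i -> 0(i+1) for i < k-1 and (k-1) -> 0, starting from 0."""
    if k < 2:
        raise ValueError("k must be at least 2")
    word = [0]
    while len(word) < length:
        image = []
        for letter in word:
            image.extend([0, letter + 1] if letter < k - 1 else [0])
        word = image  # each iterate is a prefix of the next one
    return word[:length]


def factor_complexity(word: list[int], n: int) -> int:
    """Number of distinct length-n factors occurring in the finite `word`."""
    return len({tuple(word[i:i + n]) for i in range(len(word) - n + 1)})


fib = k_bonacci_prefix(2, 200)   # Fibonacci word prefix
trib = k_bonacci_prefix(3, 500)  # Tribonacci word prefix
print([factor_complexity(fib, n) for n in range(1, 6)])   # [2, 3, 4, 5, 6]
print([factor_complexity(trib, n) for n in range(1, 4)])  # [3, 5, 7]
```

For k = 2 this is the Fibonacci word, a Sturmian word, so the count is n + 1 for each n; for k = 3 the observed counts follow 2n + 1.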

A preliminary version of some results presented in this paper can be found in [Restivo, Romana, Sciortino, String Attractors and Infinite Words, LATIN 2022].

Source journal
CiteScore: 2.90
Self-citation rate: 9.10%
Articles published: 94
Review time: 12 months
About the journal: The Journal of Combinatorial Theory publishes original mathematical research concerned with theoretical and physical aspects of the study of finite and discrete structures in all branches of science. Series A is concerned primarily with structures, designs, and applications of combinatorics and is a valuable tool for mathematicians and computer scientists.