Online Computation of String Net Frequency
Peaker Guo, Seeun William Umboh, Anthony Wirth, Justin Zobel
arXiv - CS - Data Structures and Algorithms, 2024-08-01, arXiv:2408.00308
Abstract
The net frequency (NF) of a string, of length $m$, in a text, of length $n$,
is the number of occurrences of the string in the text with unique left and
right extensions. Recently, Guo et al. [CPM 2024] showed that NF is
combinatorially interesting and that two key problems can be solved
efficiently in the offline setting. First, SINGLE-NF: reporting the NF of a
query string in an input text. Second, ALL-NF: reporting an occurrence and the
NF of each string of positive NF in an input text. For many applications,
however, performing these computations online is highly
desirable. We are the first to solve the above two problems in the online
setting, and we do so in optimal time, assuming, as is common, a constant-size
alphabet: SINGLE-NF in $O(m)$ time and ALL-NF in $O(n)$ time. Our results are
achieved by first designing new and simpler offline algorithms using suffix
trees, proving additional properties of NF, and exploiting Ukkonen's online
suffix tree construction algorithm and results on implicit node maintenance in
an implicit suffix tree by Breslauer and Italiano.
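
To make the definition of NF concrete, the following is a minimal brute-force sketch of SINGLE-NF and ALL-NF. It is not the suffix-tree method described above and runs in polynomial rather than optimal time. It rests on several assumptions that the abstract does not state: an extension is "unique" when it occurs exactly once in the text, an occurrence at either end of the text has no extension on that side and is not counted, and a string that occurs only once has NF zero. The function names are hypothetical.

```python
def count_occurrences(text, pattern):
    """Count possibly overlapping occurrences of pattern in text."""
    if not pattern:
        return 0
    last_start = len(text) - len(pattern)
    return sum(1 for i in range(last_start + 1) if text.startswith(pattern, i))


def net_frequency(text, s):
    """Brute-force SINGLE-NF under the assumptions stated in the lead-in."""
    m = len(s)
    # Assumed convention: empty or non-repeated strings have NF zero.
    if m == 0 or count_occurrences(text, s) < 2:
        return 0
    nf = 0
    # Only positions with a symbol on both sides are considered.
    for i in range(1, len(text) - m):
        if not text.startswith(s, i):
            continue
        left_ext = text[i - 1:i + m]    # preceding symbol followed by s
        right_ext = text[i:i + m + 1]   # s followed by the next symbol
        if (count_occurrences(text, left_ext) == 1
                and count_occurrences(text, right_ext) == 1):
            nf += 1
    return nf


def all_net_frequencies(text):
    """Brute-force ALL-NF: map each string of positive NF to
    (start index of one occurrence, NF)."""
    result = {}
    seen = set()
    n = len(text)
    for i in range(n):
        for j in range(i + 1, n + 1):
            s = text[i:j]
            if s in seen:
                continue
            seen.add(s)
            nf = net_frequency(text, s)
            if nf > 0:
                result[s] = (i, nf)
    return result
```

For instance, in the text `xabyzabw` the string `ab` occurs twice, and each occurrence has left and right extensions (`xab`, `aby`, `zab`, `abw`) that occur only once, so its NF under these assumptions is 2. In `xabyxabz` the left extension `xab` is repeated, so the NF of `ab` is 0.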