Some theory and practice of greedy off-line textual substitution

Proceedings DCC '98 Data Compression Conference (Cat. No.98TB100225). Pub Date: 1998-03-30. DOI: 10.1109/DCC.1998.672138

A. Apostolico, S. Lonardi

Citations: 40

Abstract

Greedy off-line textual substitution refers to the following steepest descent approach to compression or structural inference. Given a long text string x, a substring w is identified such that replacing all instances of w in x except one by a suitable pair of pointers yields the highest possible contraction of x; the process is then repeated on the contracted text string, until substrings capable of producing contractions can no longer be found. This paper examines the computational issues and performance resulting from implementations of this paradigm in preliminary applications and experiments. Apart from intrinsic interest, these methods may find use in the compression of massively disseminated data, and lend themselves to efficient parallel implementation, perhaps on dedicated architectures.
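The greedy loop described in the abstract can be illustrated with a small brute-force sketch. This is not the authors' implementation: the cost model (`POINTER_COST`, one unit per literal symbol), the use of left-to-right non-overlapping occurrences, the `rules` dictionary that stands in for resolving a pointer to the retained copy of w, and all function names are assumptions made for illustration only. The search is deliberately naive and would not scale to long texts.

```python
from typing import Dict, List, Optional, Tuple

POINTER_COST = 2  # assumption: one pointer pair costs as much as two literal symbols


def cost(symbol) -> int:
    """Cost of one symbol: pointer tokens are tuples, literals are characters."""
    return POINTER_COST if isinstance(symbol, tuple) else 1


def total_cost(seq) -> int:
    return sum(cost(s) for s in seq)


def occurrences(seq: tuple, w: tuple) -> List[int]:
    """Start positions of left-to-right, non-overlapping occurrences of w in seq."""
    positions, i, m = [], 0, len(w)
    while i + m <= len(seq):
        if seq[i:i + m] == w:
            positions.append(i)
            i += m
        else:
            i += 1
    return positions


def best_substring(seq: tuple) -> Tuple[Optional[tuple], int]:
    """Brute-force search for the substring whose replacement contracts seq the most."""
    best, best_gain, seen = None, 0, set()
    n = len(seq)
    for m in range(2, n // 2 + 1):
        for i in range(n - m + 1):
            w = seq[i:i + m]
            if w in seen:
                continue
            seen.add(w)
            f = len(occurrences(seq, w))
            # All occurrences except one are replaced by a pointer.
            gain = (f - 1) * (total_cost(w) - POINTER_COST)
            if gain > best_gain:
                best, best_gain = w, gain
    return best, best_gain


def greedy_offline(text: str) -> Tuple[tuple, Dict[int, tuple]]:
    """Repeat the best contraction until no substring yields a positive gain."""
    seq, rules = tuple(text), {}
    while True:
        w, _ = best_substring(seq)
        if w is None:
            break
        rule_id = len(rules)
        rules[rule_id] = w
        replace_at = set(occurrences(seq, w)[1:])  # keep the first occurrence
        out, i = [], 0
        while i < len(seq):
            if i in replace_at:
                out.append(("ptr", rule_id))
                i += len(w)
            else:
                out.append(seq[i])
                i += 1
        seq = tuple(out)
    return seq, rules


def expand(seq, rules: Dict[int, tuple]) -> str:
    """Resolve pointer tokens recursively to recover the original text."""
    parts = []
    for s in seq:
        parts.append(expand(rules[s[1]], rules) if isinstance(s, tuple) else s)
    return "".join(parts)
```

Each round strictly lowers the total cost under this model, so the loop terminates; `expand(*greedy_offline(text))` should reproduce `text`, which gives a simple sanity check on the sketch.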


