An overhead reduction technique for mega-state compression schemes

Proceedings DCC '97. Data Compression Conference. Pub Date: 1997-03-25. DOI: 10.1109/DCC.1997.582061

A. Bookstein, S. T. Klein, T. Raita

Citations: 5

Abstract

Many of the most effective compression methods involve complicated models. Unfortunately, as model complexity increases, so does the cost of storing the model itself. This paper examines a method to reduce the amount of storage needed to represent a Markov model with an extended alphabet, by applying a clustering scheme that brings together similar states. Experiments run on a variety of large natural language texts show that much of the overhead of storing the model can be saved at the cost of a very small loss of compression efficiency.
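The abstract does not give the clustering procedure, so the following Python is only an illustrative sketch of the general idea: states whose next-symbol statistics are similar are pooled so that one stored distribution serves several states. The cost measure (extra bits paid when two states share pooled counts), the greedy first-fit strategy, and the names `merge_cost`, `cluster_states`, and `max_cost` are all assumptions made for illustration, not the authors' method.

```python
import math
from collections import Counter


def merge_cost(a: Counter, b: Counter) -> float:
    """Extra bits paid if states a and b are coded with one pooled distribution.

    Assumed similarity measure for this sketch; the paper may use another.
    """
    pooled = a + b

    def bits(counts: Counter, model: Counter) -> float:
        # Ideal code length of `counts` under the distribution implied by `model`.
        total = sum(model.values())
        return -sum(n * math.log2(model[s] / total) for s, n in counts.items())

    return bits(a, pooled) + bits(b, pooled) - bits(a, a) - bits(b, b)


def cluster_states(states: dict, max_cost: float) -> list:
    """Greedy first-fit clustering (hypothetical strategy).

    Each state joins the first cluster whose pooled counts it can share
    for at most `max_cost` extra bits; otherwise it starts a new cluster.
    """
    clusters = []  # list of (member state names, pooled symbol counts)
    for name, counts in states.items():
        for members, pooled in clusters:
            if merge_cost(pooled, counts) <= max_cost:
                members.append(name)
                pooled.update(counts)  # Counter.update adds the counts
                break
        else:
            clusters.append(([name], Counter(counts)))
    return clusters


# Toy model: each state maps to counts of the symbols that follow it.
example_states = {
    "th": Counter({"e": 8, "a": 2}),
    "sh": Counter({"e": 4, "a": 1}),
    "qu": Counter({"i": 5}),
}
example_clusters = cluster_states(example_states, max_cost=1.0)
```

In this toy setting, only one distribution per cluster would need to be stored, plus a map from each state to its cluster. Raising `max_cost` trades more compression loss for fewer stored distributions, which mirrors the trade-off the abstract describes.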
