Textual image compression

I. Witten, T. Bell, M. Harrison, Mark L. James, Alistair Moffat
Data Compression Conference, 1992. Published 1992-03-24. DOI: 10.1109/DCC.1992.227477 (https://doi.org/10.1109/DCC.1992.227477)
Citations: 11

Abstract

The authors describe a method for lossless compression of images that contain predominantly typed or typeset text; they call these textual images. An increasingly popular application is document archiving, where documents are scanned by a computer and stored electronically for later retrieval. Their project was motivated by such an application: Trinity College in Dublin, Ireland, are archiving their 1872 printed library catalogues onto disk, and in order to preserve the exact form of the original document, pages are being stored as scanned images rather than being converted to text. The test images are taken from this catalogue. These typeset documents have a rather old-fashioned look, and contain a wide variety of symbols from several different typefaces: the five test images used contain text in English, Flemish, Latin and Greek, and include italics and small capitals as well as roman letters. The catalogue also contains Hebrew, Syriac, and Russian text.