Text line segmentation of historical documents: a survey

IF 2.5 · Zone 4, Computer Science · Q3 Computer Science, Artificial Intelligence

International Journal on Document Analysis and Recognition Pub Date : 2007-04-04 DOI:10.5555/1237480.1237483

Laurence Likforman-Sulem, Abderrazak Zahour, Bruno Taconet

Citations: 8

Abstract

There is a huge amount of historical documents in libraries and in various National Archives that have not been exploited electronically. Although automatic reading of complete pages remains, in mo...

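Source journal

International Journal on Document Analysis and Recognition (Engineering and Technology: Computer Science, Artificial Intelligence)

CiteScore: 6.20

Self-citation rate: 4.30%

Articles published: (no value given)

Review time: 7.5 months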

About the journal: The large number of existing documents and the production of a multitude of new ones every year raise important issues in efficient handling, retrieval and storage of these documents and the information which they contain. This has led to the emergence of new research domains dealing with the recognition by computers of the constituent elements of documents, including characters, symbols, text, lines, graphics, images, handwriting, signatures, etc. In addition, these new domains deal with automatic analyses of the overall physical and logical structures of documents, with the ultimate objective of a high-level understanding of their semantic content. We have also seen renewed interest in optical character recognition (OCR) and handwriting recognition during the last decade. Document analysis and recognition are obviously the next stage. Automatic, intelligent processing of documents is at the intersection of many fields of research, especially computer vision, image analysis, pattern recognition and artificial intelligence, as well as studies on reading, handwriting and linguistics. Although quality document-related publications continue to appear in journals dedicated to these domains, the community will benefit from having this journal as a focal point for archival literature dedicated to document analysis and recognition.