Automatic Thai and English fonts identification without character recognition

2001 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing (IEEE Cat. No.01CH37233) Pub Date : 2001-08-26 DOI:10.1109/PACRIM.2001.953705

B. Kruatrachue, Pongsakorn Piyatrakul

Citations: 3

Abstract

This paper describes a simple, fast algorithm for detecting Thai and English characters in a document without performing actual character recognition. The document is segmented into strings of letters separated by blanks, and each string is then identified using character features and their writing positions. The method achieves 100% accuracy when the characters have a clear head feature; when this feature is not used, 90% of the strings can still be identified. The identification gives the OCR stage more information about the character set, so recognition can be faster and more accurate.
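The abstract does not give implementation details, so the following is only a minimal sketch of the first step it mentions, splitting a text line into strings separated by blanks. It assumes a binarized text line stored as rows of 0/1 pixels and uses a column projection, a common technique that may differ from the authors' own. The function name `segment_strings` and the `min_gap` threshold are hypothetical, and the script identification step (character features and writing positions) is not attempted here.

```python
from typing import List, Tuple


def segment_strings(line: List[List[int]], min_gap: int = 3) -> List[Tuple[int, int]]:
    """Split a binarized text line into column spans separated by blank gaps.

    line:    rows of 0/1 pixels, where 1 means ink (assumed input format).
    min_gap: hypothetical threshold; this many consecutive empty columns
             (or more) are treated as a blank between two strings.
    Returns half-open (start, end) column ranges, one per string.
    """
    if not line:
        return []
    width = len(line[0])
    # Column projection: True where a column contains any ink.
    has_ink = [any(row[x] for row in line) for x in range(width)]

    spans: List[Tuple[int, int]] = []
    start = None   # first ink column of the string being built
    last_ink = -1  # most recent ink column seen
    for x, ink in enumerate(has_ink):
        if not ink:
            continue
        if start is None:
            start = x
        elif x - last_ink - 1 >= min_gap:
            # The empty run is wide enough to count as a blank.
            spans.append((start, last_ink + 1))
            start = x
        last_ink = x
    if start is not None:
        spans.append((start, last_ink + 1))
    return spans


if __name__ == "__main__":
    demo = [
        [1, 1, 0, 1, 0, 0, 0, 1, 0],
        [0, 1, 0, 0, 0, 0, 0, 1, 1],
    ]
    print(segment_strings(demo, min_gap=3))
```

In this sketch, narrow gaps between letters stay inside one string, while gaps of at least `min_gap` columns start a new one. A real document would need the threshold tuned to the font size and scan resolution.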

