Automatic Thai and English fonts identification without character recognition

2001 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing (IEEE Cat. No.01CH37233) Pub Date: 2001-08-26 DOI: 10.1109/PACRIM.2001.953705

B. Kruatrachue, Pongsakorn Piyatrakul


Citations: 3

Abstract

This paper describes a simple, fast algorithm for detecting Thai and English characters in a document without performing actual character recognition. The document is segmented into strings of letters separated by blanks, and each string is then identified using character features and their writing positions. The method achieves 100% accuracy when the characters have a clear head feature; without this feature, 90% of the strings can still be identified. This identification gives the OCR stage more information about the character set, so that it can recognize text faster and more accurately.
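The abstract does not say how the strings are separated, so the following is only a minimal sketch of the first step (splitting a text line into strings at blanks), assuming a binarized line image and a vertical projection of ink columns. The function name `segment_strings`, the `min_gap` threshold, and the list-of-rows image representation are all hypothetical choices for illustration, not details taken from the paper.

```python
from typing import List, Tuple


def segment_strings(line: List[List[int]], min_gap: int = 3) -> List[Tuple[int, int]]:
    """Split one binarized text line into strings separated by blanks.

    `line` is a list of pixel rows (1 = ink, 0 = background).
    Returns half-open (start, end) column ranges, one per string.
    A run of at least `min_gap` ink-free columns counts as a blank;
    this threshold is an assumption, not a value from the paper.
    """
    width = len(line[0]) if line else 0
    # Vertical projection: a column is "inked" if any row has ink there.
    has_ink = [any(row[x] for row in line) for x in range(width)]

    spans: List[Tuple[int, int]] = []
    start = None      # first column of the string being built
    last_ink = None   # most recent inked column seen
    for x, ink in enumerate(has_ink):
        if not ink:
            continue
        if start is None:
            start = x
        elif x - last_ink - 1 >= min_gap:
            # The gap since the last inked column is wide enough to be a blank.
            spans.append((start, last_ink + 1))
            start = x
        last_ink = x
    if start is not None:
        spans.append((start, last_ink + 1))
    return spans


if __name__ == "__main__":
    toy_line = [
        [1, 1, 0, 1, 0, 0, 0, 1, 1, 0],
        [0, 1, 0, 0, 0, 0, 0, 1, 0, 0],
    ]
    print(segment_strings(toy_line, min_gap=3))
```

Each returned column range would then be passed to the per-string script identification step, which the abstract describes only as using character features and writing positions.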
