Handwritten address recognition with open vocabulary using character n-grams

Proceedings Eighth International Workshop on Frontiers in Handwriting Recognition. Pub Date: 2002-08-06. DOI: 10.1109/IWFHR.2002.1030936

A. Brakensiek, J. Rottland, G. Rigoll


Citations: 26

Abstract

This paper describes a recognition system for handwritten address words that is based on tied-mixture hidden Markov models and uses a language model consisting of backoff character n-grams. A dictionary-based recognition system requires that the structure of the address (name, street, city) be known. If the individual parts of the address cannot be categorized, the vocabulary in use is unknown and therefore unlimited. The performance of this open-vocabulary recognition using n-grams is compared with that of dictionaries of different sizes. In particular, the confidence of the recognition results and the possibility of useful post-processing are significant advantages of language models.
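To make the idea of a backoff character n-gram language model concrete, here is a minimal Python sketch. It is an illustration only, not the authors' implementation: the abstract does not say which backoff scheme or n-gram order the paper uses, so this sketch substitutes a simple fixed-weight backoff (often called "stupid backoff"), which yields relative scores rather than normalized probabilities. The class name `BackoffCharNgram`, the boundary symbols, the weight `alpha`, and the toy training words are all assumptions made for illustration.

```python
import math
from collections import Counter

BOS, EOS = "^", "$"  # hypothetical word-boundary symbols


class BackoffCharNgram:
    """Character n-gram scorer with fixed-weight backoff (illustrative only)."""

    def __init__(self, n=3, alpha=0.4):
        self.n = n                 # maximum n-gram order
        self.alpha = alpha         # penalty applied at each backoff step
        self.ngrams = Counter()    # counts of (context + character) tuples
        self.contexts = Counter()  # counts of context tuples

    def train(self, words):
        for word in words:
            chars = [BOS] * (self.n - 1) + list(word) + [EOS]
            for i in range(self.n - 1, len(chars)):
                # Count every context length from 0 up to n-1.
                for k in range(self.n):
                    ctx = tuple(chars[i - k:i])
                    self.ngrams[ctx + (chars[i],)] += 1
                    self.contexts[ctx] += 1

    def score(self, context, ch):
        """Score `ch` given preceding characters, backing off when unseen."""
        ctx = tuple(context)[-(self.n - 1):] if self.n > 1 else ()
        weight = 1.0
        while True:
            seen = self.ngrams[ctx + (ch,)]
            if seen > 0:
                return weight * seen / self.contexts[ctx]
            if not ctx:
                # Character never seen at all: small floor score.
                return weight / (self.contexts[()] + 1)
            ctx = ctx[1:]          # drop the oldest context character
            weight *= self.alpha

    def word_logscore(self, word):
        """Sum of log scores over all characters, including the end symbol."""
        chars = [BOS] * (self.n - 1) + list(word) + [EOS]
        return sum(
            math.log(self.score(chars[i - self.n + 1:i], chars[i]))
            for i in range(self.n - 1, len(chars))
        )


# Toy usage: no dictionary lookup is needed to score an unseen word.
model = BackoffCharNgram(n=3, alpha=0.4)
model.train(["berlin", "bern", "bonn", "bremen"])
print(model.word_logscore("berm"), model.word_logscore("xqzv"))
```

Because such a model scores any character string, it can rank hypotheses that appear in no dictionary, which is the open-vocabulary setting the abstract describes. In a full recognizer these scores would be combined with the HMM's observation likelihoods during decoding, a step this sketch leaves out.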
