Determination of the number of writing variants with an HMM based cursive word recognition system

Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings. Pub Date: 2003-08-03. DOI: 10.1109/ICDAR.2003.1227644

M. Schambach


Citations: 12

Abstract


An important parameter in building a cursive script model is the number of distinct, relevant writing variants of the letters. An algorithm is presented that performs this task automatically by optimizing the number of letter models in an HMM-based script recognition system. The algorithm iteratively modifies selected letter models; for the selection, quality measures such as HMM distance and emission weight entropy are developed, and their correlation with recognition performance is shown. Theoretical measures for selecting the overall model complexity are also presented, but the best results are obtained with direct selection criteria: the likelihood and the recognition rate on the training data. With the optimized models, an average improvement in recognition rate of up to 5.8 percent was achieved.
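The abstract names emission weight entropy as one quality measure for choosing which letter models to modify, but it does not give the definition. As a minimal sketch, assuming each HMM state carries a vector of mixture (emission) weights and that the measure is the ordinary Shannon entropy of those weights averaged over the states of a letter model, the ranking step might look like the following. The function names, the data layout, and the choice to rank from highest to lowest entropy are all illustrative assumptions, not the paper's method.

```python
import math
from typing import Dict, List, Sequence


def weight_entropy(weights: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one state's emission weights.

    Assumption: weights are non-negative; they are normalized here.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    probs = [w / total for w in weights]
    # Terms with p == 0 contribute nothing to the entropy.
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def mean_state_entropy(model: List[Sequence[float]]) -> float:
    """Average weight entropy over all states of one letter model."""
    return sum(weight_entropy(w) for w in model) / len(model)


def rank_letters_by_entropy(
    models: Dict[str, List[Sequence[float]]]
) -> List[str]:
    """Letter names sorted from highest to lowest mean entropy.

    Hypothetical selection order; the abstract does not state which
    direction marks a model as a candidate for modification.
    """
    return sorted(
        models, key=lambda name: mean_state_entropy(models[name]), reverse=True
    )


# Toy data: two letter models, each a list of per-state weight vectors.
toy_models = {
    "a": [[0.25, 0.25, 0.25, 0.25], [0.5, 0.5]],
    "b": [[1.0, 0.0], [0.9, 0.1]],
}
ranking = rank_letters_by_entropy(toy_models)
```

In this toy data the weights of letter "a" are spread evenly over the mixture components, so it ranks ahead of "b", whose weights are concentrated on a single component. According to the abstract, a full system would pair such a measure with HMM distance and then accept or reject each modification based on the likelihood or recognition rate on the training data.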
