Improving HMM-Based Chinese Handwriting Recognition Using Delta Features and Synthesized String Samples

Tonghua Su, Cheng-Lin Liu
Published in: 2010 12th International Conference on Frontiers in Handwriting Recognition
Publication date: 2010-11-16
DOI: https://doi.org/10.1109/ICFHR.2010.18
Citations: 11

Abstract

The HMM-based segmentation-free strategy for Chinese handwriting recognition has the advantage of training without annotation of character boundaries. However, recognition performance has been limited by the small number of string samples. In this paper, we explore two techniques to improve the performance. First, Delta features are added to the static ones to alleviate the conditional independence assumption of HMMs. We then investigate techniques for synthesizing string samples from isolated character images. We show that synthesizing linguistically natural string samples makes insufficient use of the isolated samples. Instead, we draw character samples without replacement and concatenate them into string images through between-character gaps. Our experimental results demonstrate that both Delta features and synthesized string samples significantly improve the recognition performance. Combining these with a bigram language model, the recognition accuracy has been increased by 36-38% compared to our previous system.
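
The two techniques named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and the abstract does not give the details, so several choices here are assumptions: the Delta coefficients use the regression formula common in speech recognition, d_t = sum_k k (c_{t+k} - c_{t-k}) / (2 sum_k k^2), with frames clamped at the sequence edges; character images are plain lists of equal-height pixel rows; and gap widths are drawn uniformly from a made-up range. The names add_delta, synthesize_strings, window, and gap_range are hypothetical.

```python
import random


def add_delta(frames, window=2):
    """Append regression-based Delta coefficients to each static frame.

    Assumption: the regression formula used in speech front ends;
    the paper may compute its Delta features differently.
    """
    n = len(frames)
    denom = 2 * sum(k * k for k in range(1, window + 1))
    out = []
    for t in range(n):
        delta = []
        for d in range(len(frames[t])):
            num = 0.0
            for k in range(1, window + 1):
                nxt = frames[min(t + k, n - 1)][d]  # clamp at the right edge
                prv = frames[max(t - k, 0)][d]      # clamp at the left edge
                num += k * (nxt - prv)
            delta.append(num / denom)
        out.append(list(frames[t]) + delta)
    return out


def synthesize_strings(samples, chars_per_string, gap_range=(2, 6), seed=0):
    """Build string images from isolated character samples.

    samples: list of (label, image); image is a list of pixel rows.
    Assumption: all images share the same height, and gap widths are
    uniform random integers in gap_range (hypothetical values).
    """
    rng = random.Random(seed)
    pool = list(samples)
    rng.shuffle(pool)  # one shuffled pass uses each sample exactly once
    strings = []
    for start in range(0, len(pool), chars_per_string):
        group = pool[start:start + chars_per_string]
        height = len(group[0][1])
        rows = [[] for _ in range(height)]
        for i, (_, image) in enumerate(group):
            if i > 0:
                gap = rng.randint(*gap_range)  # blank between-character gap
                for row in rows:
                    row.extend([0] * gap)
            for row, src in zip(rows, image):
                row.extend(src)
        labels = [label for label, _ in group]
        strings.append((labels, rows))
    return strings
```

Shuffling the pool once and cutting it into fixed-length groups is one simple way to draw without replacement: every isolated sample appears in exactly one synthesized string, which is the property the abstract contrasts with synthesizing linguistically natural text. The label sequence returned with each image would serve as the string-level transcript for boundary-free training.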