Improving HMM-Based Chinese Handwriting Recognition Using Delta Features and Synthesized String Samples
Tonghua Su, Cheng-Lin Liu
2010 12th International Conference on Frontiers in Handwriting Recognition, 2010-11-16
DOI: 10.1109/ICFHR.2010.18 (https://doi.org/10.1109/ICFHR.2010.18)
Citations: 11
Abstract
The HMM-based segmentation-free strategy for Chinese handwriting recognition has the advantage that training requires no annotation of character boundaries. However, recognition performance has been limited by the small number of string samples. In this paper, we explore two techniques to improve performance. First, Delta features are added to the static features to alleviate the conditional independence assumption of HMMs. Second, we investigate techniques for synthesizing string samples from isolated character images. We show that synthesizing linguistically natural string samples makes insufficient use of the isolated samples. Instead, we draw character samples without replacement and concatenate them into string images, separated by between-character gaps. Our experimental results demonstrate that both Delta features and synthesized string samples significantly improve recognition performance. Combined with a bigram language model, these techniques increase recognition accuracy by 36-38% compared to our previous system.
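The two techniques named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not give the Delta formula, so `add_delta_features` assumes the regression-style delta commonly used in HMM toolkits, with a hypothetical `window` parameter. `synthesize_strings` assumes that all character images share one height, that the background is zero-valued, and that gap widths are drawn uniformly from a hypothetical `gap_range`. All function and parameter names are illustrative.

```python
import numpy as np


def add_delta_features(static, window=2):
    """Append regression-based delta features to a (T, D) frame sequence.

    Assumed formula (not taken from the paper):
    delta_t = sum_n n * (c[t+n] - c[t-n]) / (2 * sum_n n^2), n = 1..window.
    """
    static = np.asarray(static, dtype=float)
    num_frames = static.shape[0]
    # Replicate edge frames so every frame has `window` neighbours per side.
    padded = np.pad(static, ((window, window), (0, 0)), mode="edge")
    denom = 2.0 * sum(n * n for n in range(1, window + 1))
    delta = np.zeros_like(static)
    for n in range(1, window + 1):
        ahead = padded[window + n: window + n + num_frames]
        behind = padded[window - n: window - n + num_frames]
        delta += n * (ahead - behind)
    # Output has shape (T, 2 * D): static features followed by deltas.
    return np.hstack([static, delta / denom])


def synthesize_strings(char_images, chars_per_string, gap_range=(2, 8), seed=0):
    """Build string images from isolated character images.

    Samples are drawn without replacement: a random permutation is cut into
    consecutive groups, so each image is used at most once. Leftover images
    that do not fill a whole string are dropped.
    Returns a list of (string_image, source_indices) pairs.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(char_images))
    height = char_images[0].shape[0]  # assumption: all images share this height
    dtype = char_images[0].dtype
    strings = []
    last_start = len(order) - chars_per_string
    for start in range(0, last_start + 1, chars_per_string):
        group = order[start: start + chars_per_string]
        pieces = []
        for position, index in enumerate(group):
            if position > 0:
                # Blank between-character gap (assumes a zero-valued background).
                gap = int(rng.integers(gap_range[0], gap_range[1] + 1))
                pieces.append(np.zeros((height, gap), dtype=dtype))
            pieces.append(char_images[index])
        strings.append((np.hstack(pieces), [int(i) for i in group]))
    return strings
```

In this sketch the returned source indices play the role of the string transcript, which is all a segmentation-free trainer needs, since no character boundaries are recorded. Because the characters are drawn in random order, the resulting strings are not linguistically natural, and every isolated sample except any leftover ones is used once.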