P. Natarajan, Krishna Subramanian, Anurag Bhardwaj, R. Prasad
Published in: 2009 10th International Conference on Document Analysis and Recognition (ICDAR), 2009-07-26. DOI: 10.1109/ICDAR.2009.278
Stochastic Segment Modeling for Offline Handwriting Recognition
In this paper, we present a novel approach for incorporating structural information into the hidden Markov model (HMM) framework for offline handwriting recognition. Traditionally, structural features have been used in recognition approaches that rely on accurate segmentation of words into smaller units (sub-words or characters). However, such segmentation-based approaches do not perform well on real-world handwritten images, because breaks and merges in glyphs typically create new connected components that are not observed in the training data. To avoid having to derive an accurate segmentation from connected components, we present a framework in which an HMM-based recognition system trained on shorter-span features generates the 2-D character images (the "stochastic segments"), and a second classifier then uses structural features extracted from these stochastic character segments to generate a new set of scores. Finally, the scores from the HMM system and from structural matching are combined to produce a hypothesis that is better than the result from either the HMM or structural matching alone. We demonstrate the efficacy of our approach by reporting experimental results on a large corpus of handwritten Arabic documents.
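The final combination step can be illustrated with a minimal sketch. The abstract does not say how the two sets of scores are combined, so the weighted sum of log-scores over an n-best list used here is an assumption, not the authors' method. The names `Hypothesis`, `combine_scores`, `rescore`, and the `weight` parameter are hypothetical, and the scores in the toy list are made-up values chosen only to show the mechanics.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hypothesis:
    text: str                # candidate transcription
    hmm_score: float         # log-score from the HMM recognizer
    structural_score: float  # log-score from the structural classifier


def combine_scores(hyp: Hypothesis, weight: float = 0.5) -> float:
    """Weighted sum of the two log-scores.

    `weight` is a hypothetical tuning parameter: 0.0 uses only the HMM
    score, 1.0 uses only the structural score.
    """
    return (1.0 - weight) * hyp.hmm_score + weight * hyp.structural_score


def rescore(nbest: List[Hypothesis], weight: float = 0.5) -> Hypothesis:
    """Return the hypothesis with the highest combined score."""
    if not nbest:
        raise ValueError("n-best list is empty")
    return max(nbest, key=lambda h: combine_scores(h, weight))


# Toy n-best list with made-up scores (illustration only).
nbest = [
    Hypothesis("word_a", hmm_score=-10.0, structural_score=-14.0),
    Hypothesis("word_b", hmm_score=-11.0, structural_score=-8.0),
]
best = rescore(nbest, weight=0.5)
print(best.text)
```

In this toy list the HMM alone prefers `word_a`, while the combined score prefers `word_b`, because the structural score penalizes `word_a` more heavily. In practice a weight like this would presumably be tuned on held-out data.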