Statistical Text Line Analysis in Handwritten Documents
Vicente Bosch, A. Rossi, E. Vidal
2012 International Conference on Frontiers in Handwriting Recognition, 2012-09-18
DOI: 10.1109/ICFHR.2012.274 (https://doi.org/10.1109/ICFHR.2012.274)
Citations: 23
Abstract
In this paper we present an approach to text line analysis and detection in handwritten documents based on Hidden Markov Models, a technique widely used in other handwriting and speech recognition tasks. We show that text line analysis and detection can be solved with a more formal methodology, in contrast to most of the heuristic approaches proposed in the literature. Our approach not only provides the best position coordinates for each of the vertical page regions but also labels them, thereby going beyond traditional heuristic methods. In our experiments we demonstrate the performance of the approach in both line analysis and detection, and study the impact of increasingly constrained "vertical layout language models" on text line detection accuracy. These experiments also show that the baselines yielded by our approach are of higher quality than those produced by a state-of-the-art heuristic method based on vertical projection profiles.
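To make the idea of labelling vertical page regions with a Hidden Markov Model concrete, the following is a minimal toy sketch, not the authors' system. It assumes a one-dimensional profile of ink counts per pixel row, only two hypothetical labels (`interline` and `textline`), Gaussian emissions with hand-picked means, and a single "stay" probability standing in for a layout language model. All names and parameter values are illustrative assumptions.

```python
import math

# Hypothetical two-label vertical layout; the paper's label set is not given here.
STATES = ("interline", "textline")


def gaussian_logpdf(x, mean, std):
    """Log density of a 1-D Gaussian, used as the emission model (an assumption)."""
    return -0.5 * math.log(2 * math.pi * std * std) - (x - mean) ** 2 / (2 * std * std)


def viterbi_rows(profile, means=(0.0, 10.0), std=3.0, stay=0.9):
    """Return the most likely state index for each row of an ink profile."""
    if not profile:
        return []
    n = len(STATES)
    log_stay, log_switch = math.log(stay), math.log(1.0 - stay)
    # Uniform prior over the initial state.
    score = [math.log(1.0 / n) + gaussian_logpdf(profile[0], means[s], std)
             for s in range(n)]
    back = []  # back[t][s] = best predecessor of state s at row t + 1
    for x in profile[1:]:
        new_score, pointers = [], []
        for s in range(n):
            cands = [score[p] + (log_stay if p == s else log_switch)
                     for p in range(n)]
            best = max(range(n), key=lambda p: cands[p])
            pointers.append(best)
            new_score.append(cands[best] + gaussian_logpdf(x, means[s], std))
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def segments(path):
    """Collapse per-row labels into (label, first_row, last_row) regions."""
    out, start = [], 0
    for i in range(1, len(path) + 1):
        if i == len(path) or path[i] != path[start]:
            out.append((STATES[path[start]], start, i - 1))
            start = i
    return out


# Made-up ink counts per pixel row: two dense bands separated by near-empty rows.
profile = [0, 1, 0, 9, 12, 11, 10, 1, 0, 0, 8, 13, 9, 0, 1]
print(segments(viterbi_rows(profile)))
```

Unlike a threshold on the projection profile, the decoding step yields both region boundaries and a label for each region in one pass, and the transition structure is where stronger layout constraints could be encoded.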