Andreas Fischer, M. Baechler, A. Garz, M. Liwicki, R. Ingold
{"title":"历史文献文本行提取与手写识别的组合系统","authors":"Andreas Fischer, M. Baechler, A. Garz, M. Liwicki, R. Ingold","doi":"10.1109/DAS.2014.51","DOIUrl":null,"url":null,"abstract":"Automated reading of historical handwriting is needed to search and browse ancient manuscripts in digital libraries based on their textual content. In this paper, we present a combined system for text localization and transcription in page images. It includes flexible learning-based methods for layout analysis and handwriting recognition, which were developed in the context of the Swiss research project HisDoc. A comprehensive experimental evaluation is provided for the medieval Parzival database, demonstrating a promising word recognition accuracy of 93.0% with closed vocabulary. In order to harmonize the evaluation of the two document analysis tasks, we introduce a novel evaluation measure for text line extraction that takes substitution, deletion, as well as insertion errors into account.","PeriodicalId":220495,"journal":{"name":"2014 11th IAPR International Workshop on Document Analysis Systems","volume":"8 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-04-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"24","resultStr":"{\"title\":\"A Combined System for Text Line Extraction and Handwriting Recognition in Historical Documents\",\"authors\":\"Andreas Fischer, M. Baechler, A. Garz, M. Liwicki, R. Ingold\",\"doi\":\"10.1109/DAS.2014.51\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Automated reading of historical handwriting is needed to search and browse ancient manuscripts in digital libraries based on their textual content. In this paper, we present a combined system for text localization and transcription in page images. It includes flexible learning-based methods for layout analysis and handwriting recognition, which were developed in the context of the Swiss research project HisDoc. A comprehensive experimental evaluation is provided for the medieval Parzival database, demonstrating a promising word recognition accuracy of 93.0% with closed vocabulary. In order to harmonize the evaluation of the two document analysis tasks, we introduce a novel evaluation measure for text line extraction that takes substitution, deletion, as well as insertion errors into account.\",\"PeriodicalId\":220495,\"journal\":{\"name\":\"2014 11th IAPR International Workshop on Document Analysis Systems\",\"volume\":\"8 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2014-04-07\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"24\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2014 11th IAPR International Workshop on Document Analysis Systems\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/DAS.2014.51\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2014 11th IAPR International Workshop on Document Analysis Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/DAS.2014.51","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
A Combined System for Text Line Extraction and Handwriting Recognition in Historical Documents
Automated reading of historical handwriting is needed to search and browse ancient manuscripts in digital libraries based on their textual content. In this paper, we present a combined system for text localization and transcription in page images. It includes flexible learning-based methods for layout analysis and handwriting recognition, which were developed in the context of the Swiss research project HisDoc. A comprehensive experimental evaluation is provided for the medieval Parzival database, demonstrating a promising word recognition accuracy of 93.0% with closed vocabulary. In order to harmonize the evaluation of the two document analysis tasks, we introduce a novel evaluation measure for text line extraction that takes substitution, deletion, as well as insertion errors into account.