Transcript mapping for historic handwritten document images
C. Tomai, Bin Zhang, Venu Govindaraju
Proceedings Eighth International Workshop on Frontiers in Handwriting Recognition, 2002-08-06
DOI: 10.1109/IWFHR.2002.1030945 (https://doi.org/10.1109/IWFHR.2002.1030945)
Citation count: 87
Abstract
A large number of scanned historical documents need to be indexed for archival and retrieval purposes. Building a visual word spotting scheme that serves these purposes is challenging even when the transcription of the document image is available. We propose a framework for mapping each word in the transcript to the associated word image in the document. Coarse word mapping based on document constraints is used for lexicon reduction. The word mappings are then refined using word recognition results: a dynamic programming algorithm finds the best match while satisfying the constraints.
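The refinement step can be illustrated with a small order-preserving alignment. The sketch below is not the authors' algorithm, which the abstract does not specify; it is a generic dynamic program under stated assumptions. The function names `similarity` and `map_transcript`, the `window` and `skip_cost` parameters, and the use of a string similarity in place of real word recognition scores are all hypothetical. The `window` check stands in for the coarse mapping: each word image may only be paired with transcript words near its own position.

```python
from difflib import SequenceMatcher


def similarity(word, guess):
    """Toy stand-in (assumption) for a word recognizer's confidence in [0, 1]."""
    return SequenceMatcher(None, word.lower(), guess.lower()).ratio()


def map_transcript(transcript, guesses, window=2, skip_cost=0.3):
    """Align transcript words to word images while preserving reading order.

    transcript: list of words from the transcript.
    guesses:    one noisy recognized string per word image (hypothetical input).
    window:     coarse constraint; word i may pair with image j only if
                abs(i - j) <= window.
    skip_cost:  penalty for leaving a transcript word or an image unmatched.
    Returns a dict {transcript index: image index}.
    """
    n, m = len(transcript), len(guesses)
    neg = float("-inf")
    # best[i][j]: best score after consuming i transcript words and j images.
    best = [[neg] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    best[0][0] = 0.0

    def relax(i, j, score, prev):
        # Keep the highest-scoring way of reaching cell (i, j).
        if score > best[i][j]:
            best[i][j] = score
            back[i][j] = prev

    for i in range(n + 1):
        for j in range(m + 1):
            here = best[i][j]
            if here == neg:
                continue
            if i < n and j < m and abs(i - j) <= window:
                gain = similarity(transcript[i], guesses[j])
                relax(i + 1, j + 1, here + gain, (i, j, True))
            if i < n:  # transcript word with no image (e.g. merged segments)
                relax(i + 1, j, here - skip_cost, (i, j, False))
            if j < m:  # image with no transcript word (e.g. spurious segment)
                relax(i, j + 1, here - skip_cost, (i, j, False))

    # Trace back from the final cell to recover the matched pairs.
    mapping = {}
    i, j = n, m
    while (i, j) != (0, 0):
        pi, pj, matched = back[i][j]
        if matched:
            mapping[pi] = pj
        i, j = pi, pj
    return mapping


if __name__ == "__main__":
    words = ["the", "quick", "brown", "fox"]
    # The third "image" is a spurious segment the aligner should skip.
    images = ["the", "qvick", "x", "brovn", "fox"]
    print(map_transcript(words, images))
```

In a real system the similarity would come from a handwriting recognizer scored against the reduced lexicon, and the window would be derived from document constraints such as line breaks rather than from raw index distance.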