Profile Based Information Retrieval from Printed Document Images
S. Abirami, D. Manjula
Computer Graphics, Imaging and Visualisation (CGIV 2007), 2007-08-14
DOI: 10.1109/CGIV.2007.67 (https://doi.org/10.1109/CGIV.2007.67)
Citations: 8
Abstract
This paper presents a profile-based approach to information retrieval from collections of printed document images. Keywords are valuable indexing tools, and if they can be identified at the image level, the extensive computation that recognition requires can be avoided. Printed documents are scanned to produce document images. Instead of converting entire document images into their text equivalents, word profiles are identified and used to match word images in bilingual (English and Tamil) document images. During retrieval, the same profile is extracted from the user-specified word and matched against the word images in the document. This yields faster results, even on quality-degraded documents. This kind of information retrieval (keyword-based search) can be adopted in digital libraries, which employ digitized documents rather than text processing. It promotes efficient search in document images irrespective of language.
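
The matching step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which profile features the authors use, so this example assumes a vertical projection profile (ink-pixel count per column) compared by mean absolute difference. The function names, the 16-point resampling, and the 0.1 threshold are all hypothetical choices made for illustration, not the paper's method.

```python
from typing import List, Sequence

# Binary word image: rows of pixels, 1 = ink, 0 = background (assumed encoding).
Image = List[List[int]]


def vertical_profile(img: Image) -> List[int]:
    """Count ink pixels in each column of a binary word image."""
    if not img:
        return []
    width = len(img[0])
    return [sum(row[x] for row in img) for x in range(width)]


def resample(profile: Sequence[float], n: int) -> List[float]:
    """Resample a profile to n points by linear interpolation,
    so words of different widths can be compared."""
    if not profile:
        return [0.0] * n
    if len(profile) == 1 or n == 1:
        return [float(profile[0])] * n
    out = []
    for i in range(n):
        pos = i * (len(profile) - 1) / (n - 1)
        lo = int(pos)
        hi = min(lo + 1, len(profile) - 1)
        frac = pos - lo
        out.append(profile[lo] * (1 - frac) + profile[hi] * frac)
    return out


def profile_distance(a: Image, b: Image, n: int = 16) -> float:
    """Mean absolute difference between height-normalised, resampled profiles."""
    pa = resample([v / len(a) for v in vertical_profile(a)], n)
    pb = resample([v / len(b) for v in vertical_profile(b)], n)
    return sum(abs(x - y) for x, y in zip(pa, pb)) / n


def find_matches(query: Image, words: List[Image], threshold: float = 0.1) -> List[int]:
    """Return indices of word images whose profile lies within
    the (hypothetical) threshold of the query word's profile."""
    return [i for i, w in enumerate(words) if profile_distance(query, w) <= threshold]
```

In such a scheme the query word would first be rendered to a word image, and its profile would then be compared against the profiles of word images segmented from the scanned pages. No character recognition is involved, which is why the same procedure could in principle apply to both English and Tamil text.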