Gaofeng Meng, N. Zheng, Yonghong Song, Yuanlin Zhang
Document Images Retrieval Based on Multiple Features Combination
Ninth International Conference on Document Analysis and Recognition (ICDAR 2007), 2007-09-23
DOI: 10.1109/ICDAR.2007.103 (https://doi.org/10.1109/ICDAR.2007.103)
Citations: 21
Abstract
Retrieving relevant document images from a large collection of digitized pages is a meaningful and challenging problem, because the pages exhibit many kinds of artificial variation and quality degradation caused by scanning and printing. We address this problem by combining several kinds of document features in a hybrid way. First, we propose two new document image features, based on the projection histograms and the crossing-number histograms of an image. Second, these two features, together with a density distribution feature and a local binary pattern feature, are combined in a multistage structure to build a novel document image retrieval system. Experimental results show that the proposed system is efficient and robust for retrieving different kinds of document images, even when some of them are severely degraded.
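
The abstract names projection histograms and crossing-number histograms but does not define them, so the following is only a rough sketch of the generic textbook versions of both ideas, not the paper's own formulation. It assumes a binarized page stored as nested lists in which 1 is ink and 0 is background. The names `projection_histograms`, `crossing_counts`, `normalize`, and `l1_distance` are hypothetical, and the L1 comparison is an assumed choice of distance.

```python
from typing import List, Sequence, Tuple

Image = List[List[int]]  # assumed binary image: 1 = ink, 0 = background


def projection_histograms(img: Image) -> Tuple[List[int], List[int]]:
    """Count ink pixels in each row and in each column."""
    rows = [sum(row) for row in img]
    cols = [sum(col) for col in zip(*img)]
    return rows, cols


def _transitions(line: Sequence[int]) -> int:
    """Count background-to-ink transitions along one scan line."""
    count, prev = 0, 0
    for px in line:
        if px == 1 and prev == 0:
            count += 1
        prev = px
    return count


def crossing_counts(img: Image) -> Tuple[List[int], List[int]]:
    """Count background-to-ink crossings for each row and each column."""
    rows = [_transitions(row) for row in img]
    cols = [_transitions(col) for col in zip(*img)]
    return rows, cols


def normalize(hist: Sequence[int]) -> List[float]:
    """Scale a histogram to sum to 1 so pages of different density compare fairly."""
    total = sum(hist)
    return [h / total for h in hist] if total else [0.0] * len(hist)


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of absolute differences between two equal-length histograms."""
    return sum(abs(x - y) for x, y in zip(a, b))


# Toy 3x5 "page" used only for illustration.
page = [
    [0, 1, 1, 0, 1],
    [0, 0, 0, 0, 0],
    [1, 1, 0, 1, 1],
]

row_proj, col_proj = projection_histograms(page)
row_cross, col_cross = crossing_counts(page)
```

In a multistage arrangement like the one the abstract describes, cheap profiles of this kind could plausibly serve as an early filter before costlier features such as local binary patterns are compared, although the abstract does not state the order of the stages.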