Olivier Augereau, N. Journet, A. Vialard, J. Domenger
Improving Classification of an Industrial Document Image Database by Combining Visual and Textual Features
2014 11th IAPR International Workshop on Document Analysis Systems, 2014-04-07
DOI: 10.1109/DAS.2014.44 (https://doi.org/10.1109/DAS.2014.44)
Citations: 21
Abstract
The main contribution of this paper is a new method for classifying document images by combining textual features extracted with the Bag of Words (BoW) technique and visual features extracted with the Bag of Visual Words (BoVW) technique. BoVW is widely used in the computer vision community for scene classification and object recognition, but few applications to the classification of entire document images have been reported. Whereas previous attempts to combine visual and textual features with the Borda count technique gave disappointing results, we propose a combination learned from data. Experiments conducted on an industrial database of 1925 document images show that this fusion scheme significantly improves classification performance. Our final contribution concerns the choice and tuning of the BoW and/or BoVW techniques in an industrial context.
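The difference between rank-based and learned fusion can be illustrated with a minimal sketch. The abstract does not specify the learning scheme, so the code below assumes a simple stacking setup: the per-class scores of a text classifier and a visual classifier are concatenated, and a nearest-centroid rule is fitted on those stacked vectors. The function names (`borda_fuse`, `fit_centroids`, `learned_fuse`), the nearest-centroid rule, and the toy scores are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def borda_fuse(text_scores, visual_scores):
    """Borda count: each classifier gives a class (n_classes - 1 - rank) points."""
    n = len(text_scores)
    points = [0] * n
    for scores in (text_scores, visual_scores):
        # Rank classes from highest to lowest score for this classifier.
        ranked = sorted(range(n), key=lambda c: scores[c], reverse=True)
        for rank, cls in enumerate(ranked):
            points[cls] += n - 1 - rank
    # Ties are broken in favour of the lowest class index.
    return max(range(n), key=lambda c: points[c])


def fit_centroids(stacked_vectors, labels):
    """Assumed second-stage learner: mean stacked score vector per class."""
    sums = {}
    counts = defaultdict(int)
    for vec, label in zip(stacked_vectors, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, value in enumerate(vec):
            acc[i] += value
        counts[label] += 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def learned_fuse(centroids, text_scores, visual_scores):
    """Predict the class whose centroid is closest to the stacked scores."""
    vec = list(text_scores) + list(visual_scores)

    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(vec, centroid))

    return min(centroids, key=lambda label: sq_dist(centroids[label]))


# Toy training set (hypothetical): the visual classifier is biased towards
# class 1, while the text classifier is informative.
train_vectors = [
    [0.9, 0.1, 0.2, 0.8],
    [0.8, 0.2, 0.3, 0.7],
    [0.2, 0.8, 0.4, 0.6],
    [0.1, 0.9, 0.3, 0.7],
]
train_labels = [0, 0, 1, 1]
centroids = fit_centroids(train_vectors, train_labels)

print(borda_fuse([0.5, 0.4, 0.1], [0.1, 0.5, 0.4]))
print(learned_fuse(centroids, [0.7, 0.3], [0.2, 0.8]))
```

In this toy setup the second stage can learn that one classifier's scores are systematically biased, which a fixed rank-voting rule such as Borda count cannot do. In practice the second stage would be a stronger classifier trained on held-out scores.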