Fast Genre Classification of Web Images Using Global and Local Features
Guo-Shuai Liu, Fei Yin, Zhenbo Luo, Cheng-Lin Liu
2017 4th IAPR Asian Conference on Pattern Recognition (ACPR), published 2017-11-01
DOI: 10.1109/ACPR.2017.84 (https://doi.org/10.1109/ACPR.2017.84)
Citation count: 9
Abstract
A large number of images are present on the Web, and the number is increasing every day. To effectively mine the content embedded in Web images, it is useful to classify the images into different types so that they can be fed to different procedures for detailed analysis, such as text and non-text image discrimination. We propose a hierarchical algorithm for efficiently classifying Web images into four classes, namely natural scene images, born-digital images, scanned paper documents, and camera-captured paper documents, which are the most prevalent image types on the Web. Our algorithm consists of two stages. The first stage extracts global features reflecting the distributions of color, edge, and gradient, and uses a support vector machine (SVM) classifier for preliminary classification. Images assigned low confidence by the first-stage classifier are processed by the second stage, which additionally extracts local texture features represented in the Bag-of-Words framework and uses another SVM classifier for final classification. In addition, we design two fusion strategies for training the second classifier and generating the final prediction label, which differ in how the local features are used in the second stage. To validate the effectiveness of the proposed method, we also built a database containing more than 55,000 images from various sources. On our test image set, we obtained an overall classification accuracy of 98.4%, with a processing speed of over 27 FPS on an Intel(R) Xeon(R) CPU (2.90 GHz).
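The confidence-gated control flow of the two stages can be illustrated with a minimal sketch. It is not the authors' implementation: the function and parameter names, the class label strings, the 0.9 threshold, and the use of feature concatenation as the fusion step are all assumptions made for illustration, since the abstract does not specify the threshold or either fusion strategy. The classifiers are passed in as plain callables that return one score per class, standing in for the two SVMs.

```python
from typing import Callable, List, Sequence, Tuple

# The four classes named in the abstract (label strings are hypothetical).
CLASSES = (
    "natural_scene",
    "born_digital",
    "scanned_document",
    "camera_captured_document",
)

Extractor = Callable[[object], Sequence[float]]        # image -> feature vector
Scorer = Callable[[Sequence[float]], Sequence[float]]  # features -> per-class scores


def top_label(scores: Sequence[float]) -> Tuple[int, float]:
    """Return the index and value of the highest class score."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return best, scores[best]


def classify_cascade(
    image: object,
    global_features: Extractor,
    stage1: Scorer,
    local_features: Extractor,
    stage2: Scorer,
    threshold: float = 0.9,  # assumed value; the abstract gives none
) -> Tuple[str, int]:
    """Classify an image and report which stage (1 or 2) made the decision."""
    g = global_features(image)
    label, confidence = top_label(stage1(g))
    if confidence >= threshold:
        # Confident first-stage prediction: skip the costlier local features.
        return CLASSES[label], 1

    # Low confidence: extract local features only for this image.
    loc = local_features(image)
    # Assumed fusion: concatenate global and local features for the second classifier.
    fused: List[float] = list(g) + list(loc)
    label, _ = top_label(stage2(fused))
    return CLASSES[label], 2
```

Because the local features are computed only for images that the first stage is unsure about, the average cost per image stays close to that of the global features alone, which is consistent with the speed-oriented design the abstract describes.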