An Efficient Approach to Web Near-Duplicate Image Detection
Jun Li, Shan Zhou, Junliang Xing, Changyin Sun, Weiming Hu
2013 2nd IAPR Asian Conference on Pattern Recognition, published 2013-11-05
DOI: 10.1109/ACPR.2013.101 (https://doi.org/10.1109/ACPR.2013.101)
Citations: 4
Abstract
This paper presents an improved bag-of-words (BoW) framework for detecting near-duplicate images on the Web and makes three main contributions. First, Locality-constrained Linear Coding (LLC) with a spatial pyramid is introduced to encode SIFT feature descriptors. Second, a weighted Chi-square distance metric is proposed to compare two histograms, together with an inverted indexing scheme for fast similarity evaluation. Third, a 6K dataset consisting of eight categories of objects, which is also applicable to image retrieval and classification, is built and will be made publicly available in the future. We verify our technique on two benchmarks: our 6K dataset and the publicly available University of Kentucky Benchmark (UKB). The experimental results demonstrate the effectiveness and efficiency of our approach for Web Near-Duplicate Image Detection (Web-NDID), which outperforms several state-of-the-art methods.
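The abstract does not give the exact form of the weighted Chi-square distance or of the inverted index, so the following Python sketch is only an illustration of the general idea, not the authors' implementation. It assumes sparse histograms stored as dictionaries, a hypothetical per-bin weight table `weights` (for example, IDF-style weights), and the simplification that an image sharing no visual word with the query is not a near-duplicate candidate. All function names are made up for this example.

```python
from collections import defaultdict


def weighted_chi_square(h1, h2, weights, eps=1e-12):
    """Weighted chi-square distance between two sparse histograms.

    h1, h2: dicts mapping visual-word id -> non-negative bin value.
    weights: dict mapping visual-word id -> weight (assumed; defaults to 1.0).
    """
    total = 0.0
    for word in set(h1) | set(h2):
        a = h1.get(word, 0.0)
        b = h2.get(word, 0.0)
        total += weights.get(word, 1.0) * (a - b) ** 2 / (a + b + eps)
    return 0.5 * total


def build_inverted_index(database):
    """Map each visual-word id to the set of image ids with a non-zero bin."""
    index = defaultdict(set)
    for image_id, hist in database.items():
        for word, value in hist.items():
            if value > 0:
                index[word].add(image_id)
    return index


def query(index, database, q_hist, weights, top_k=3):
    """Score only images that share at least one visual word with the query."""
    candidates = set()
    for word, value in q_hist.items():
        if value > 0:
            candidates |= index.get(word, set())
    scored = [(weighted_chi_square(q_hist, database[i], weights), i)
              for i in candidates]
    scored.sort()  # smallest distance first
    return scored[:top_k]


# Toy data: three images described by L1-normalized histograms.
database = {
    "a": {0: 0.5, 1: 0.5},
    "b": {0: 0.5, 2: 0.5},
    "c": {3: 1.0},
}
weights = {0: 1.0, 1: 2.0, 2: 2.0, 3: 1.0}  # hypothetical per-word weights

index = build_inverted_index(database)
results = query(index, database, {0: 0.5, 1: 0.5}, weights)
print(results)
```

In this toy run, image "c" is never scored because it shares no visual word with the query, which is where an inverted index saves work on a large collection.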