Accurate content-based video copy detection with efficient feature indexing

Yusuke Uchida, M. Agrawal, S. Sakazawa
{"title":"准确的基于内容的视频复制检测与高效的特征索引","authors":"Yusuke Uchida, M. Agrawal, S. Sakazawa","doi":"10.1145/1991996.1992015","DOIUrl":null,"url":null,"abstract":"We describe an accurate content-based copy detection system that uses both local and global visual features to ensure robustness. Our system advances state-of-the-art techniques in four key directions. (1) Multiple-codebook-based product quantization: conventional product quantization methods encode feature vectors using a single codebook, resulting in large quantization error. We propose a novel codebook generation method for an arbitrary number of codebooks. (2) Handling of temporal burstiness: for a stationary scene, once a query feature matches incorrectly, the match continues in successive frames, resulting in a high false-alarm rate. We present a temporal-burstiness-aware scoring method that reduces the impact from similar features, thereby reducing false alarms. (3) Densely sampled SIFT descriptors: conventional global features suffer from a lack of distinctiveness and invariance to non-photometric transformations. Our densely sampled global SIFT features are more discriminative and robust against logo or pattern insertions. (4) Bigram- and multiple-assignment-based indexing for global features: we extract two SIFT descriptors from each location, which makes them more distinctive. To improve recall, we propose multiple assignments on both the query and reference sides. Performance evaluation on the TRECVID 2009 dataset indicates that both local and global approaches outperform conventional schemes. Furthermore, the integration of these two approaches achieves a three-fold reduction in the error rate when compared with the best performance reported in the TRECVID 2009 workshop.","PeriodicalId":390933,"journal":{"name":"Proceedings of the 1st ACM International Conference on Multimedia Retrieval","volume":"653 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2011-04-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"13","resultStr":"{\"title\":\"Accurate content-based video copy detection with efficient feature indexing\",\"authors\":\"Yusuke Uchida, M. Agrawal, S. Sakazawa\",\"doi\":\"10.1145/1991996.1992015\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We describe an accurate content-based copy detection system that uses both local and global visual features to ensure robustness. Our system advances state-of-the-art techniques in four key directions. (1) Multiple-codebook-based product quantization: conventional product quantization methods encode feature vectors using a single codebook, resulting in large quantization error. We propose a novel codebook generation method for an arbitrary number of codebooks. (2) Handling of temporal burstiness: for a stationary scene, once a query feature matches incorrectly, the match continues in successive frames, resulting in a high false-alarm rate. We present a temporal-burstiness-aware scoring method that reduces the impact from similar features, thereby reducing false alarms. (3) Densely sampled SIFT descriptors: conventional global features suffer from a lack of distinctiveness and invariance to non-photometric transformations. Our densely sampled global SIFT features are more discriminative and robust against logo or pattern insertions. (4) Bigram- and multiple-assignment-based indexing for global features: we extract two SIFT descriptors from each location, which makes them more distinctive. 
To improve recall, we propose multiple assignments on both the query and reference sides. Performance evaluation on the TRECVID 2009 dataset indicates that both local and global approaches outperform conventional schemes. Furthermore, the integration of these two approaches achieves a three-fold reduction in the error rate when compared with the best performance reported in the TRECVID 2009 workshop.\",\"PeriodicalId\":390933,\"journal\":{\"name\":\"Proceedings of the 1st ACM International Conference on Multimedia Retrieval\",\"volume\":\"653 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2011-04-18\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"13\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 1st ACM International Conference on Multimedia Retrieval\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/1991996.1992015\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 1st ACM International Conference on Multimedia Retrieval","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/1991996.1992015","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 13

Abstract

We describe an accurate content-based copy detection system that uses both local and global visual features to ensure robustness. Our system advances state-of-the-art techniques in four key directions. (1) Multiple-codebook-based product quantization: conventional product quantization methods encode feature vectors using a single codebook, resulting in large quantization error. We propose a novel codebook generation method for an arbitrary number of codebooks. (2) Handling of temporal burstiness: for a stationary scene, once a query feature matches incorrectly, the match continues in successive frames, resulting in a high false-alarm rate. We present a temporal-burstiness-aware scoring method that reduces the impact from similar features, thereby reducing false alarms. (3) Densely sampled SIFT descriptors: conventional global features suffer from a lack of distinctiveness and invariance to non-photometric transformations. Our densely sampled global SIFT features are more discriminative and robust against logo or pattern insertions. (4) Bigram- and multiple-assignment-based indexing for global features: we extract two SIFT descriptors from each location, which makes them more distinctive. To improve recall, we propose multiple assignments on both the query and reference sides. Performance evaluation on the TRECVID 2009 dataset indicates that both local and global approaches outperform conventional schemes. Furthermore, the integration of these two approaches achieves a three-fold reduction in the error rate when compared with the best performance reported in the TRECVID 2009 workshop.
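
A minimal sketch of standard product quantization for 128-d SIFT-like descriptors, for context on direction (1): the paper's multiple-codebook generation method is not reproduced here, and the sub-vector count M, codebook size K, and random training data below are illustrative assumptions.

```python
# Standard product quantization: split each descriptor into M sub-vectors and
# quantize each sub-vector with its own k-means codebook.
import numpy as np
from sklearn.cluster import KMeans

D, M, K = 128, 8, 256                                # descriptor dim, sub-vectors, centroids each
rng = np.random.default_rng(0)
train = rng.random((2000, D)).astype(np.float32)     # stand-in for training descriptors
sub_dim = D // M

# Train one k-means codebook per sub-vector position.
codebooks = [
    KMeans(n_clusters=K, n_init=1, random_state=0)
    .fit(train[:, m * sub_dim:(m + 1) * sub_dim])
    .cluster_centers_
    for m in range(M)
]

def pq_encode(x):
    """Encode one descriptor as M centroid indices (one byte each when K=256)."""
    return [
        int(np.argmin(np.linalg.norm(cb - x[m * sub_dim:(m + 1) * sub_dim], axis=1)))
        for m, cb in enumerate(codebooks)
    ]

print(pq_encode(train[0]))
```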
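A hedged sketch of temporal-burstiness-aware scoring for direction (2): in a stationary scene the same incorrect match tends to repeat over successive frames, so repeated matches of one reference frame are damped rather than each casting a full vote. The square-root damping and the data layout are illustrative assumptions, not the paper's exact scoring function.

```python
import math
from collections import defaultdict

def burstiness_aware_score(matches):
    """matches: iterable of (query_frame, reference_video, reference_frame) tuples."""
    repeats = defaultdict(int)
    for _query_frame, ref_video, ref_frame in matches:
        repeats[(ref_video, ref_frame)] += 1
    scores = defaultdict(float)
    for (ref_video, _ref_frame), count in repeats.items():
        scores[ref_video] += math.sqrt(count)   # a burst of identical matches counts sub-linearly
    return dict(scores)

# Ten identical matches caused by a stationary scene score sqrt(10) ~= 3.16, not 10.
print(burstiness_aware_score([(t, "ref_A", 5) for t in range(10)]))
```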
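A hedged sketch of densely sampled SIFT descriptors (direction 3) using OpenCV: descriptors are computed on a fixed grid of locations instead of detected interest points. The grid step, patch size, and random test frame are illustrative assumptions; this requires opencv-python with cv2.SIFT_create (OpenCV >= 4.4).

```python
import cv2
import numpy as np

def dense_sift(gray, step=16, size=16):
    """Compute one 128-d SIFT descriptor per grid point of a grayscale frame."""
    keypoints = [cv2.KeyPoint(float(x), float(y), float(size))
                 for y in range(step, gray.shape[0] - step, step)
                 for x in range(step, gray.shape[1] - step, step)]
    _, descriptors = cv2.SIFT_create().compute(gray, keypoints)
    return descriptors

frame = np.random.default_rng(0).integers(0, 256, (240, 320), dtype=np.uint8)
print(dense_sift(frame).shape)   # (number of grid points, 128)
```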
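A hedged sketch of bigram indexing with multiple assignment on both sides (direction 4): each location contributes two quantized visual words, the pair (bigram) is the inverted-index key, and both reference and query descriptors are assigned to their few nearest words to improve recall. The helper nearest_words, the vocabulary size, and the 3-way assignment are illustrative assumptions, not the paper's exact procedure.

```python
from collections import defaultdict
from itertools import product
import numpy as np

def nearest_words(descriptor, vocabulary, n=3):
    """Multiple assignment: ids of the n nearest visual words for one descriptor."""
    return np.argsort(np.linalg.norm(vocabulary - descriptor, axis=1))[:n].tolist()

index = defaultdict(list)   # bigram (word_a, word_b) -> reference frame ids

def index_location(desc_a, desc_b, vocabulary, frame_id):
    # Reference side: index every candidate bigram formed by the assigned words.
    for bigram in product(nearest_words(desc_a, vocabulary),
                          nearest_words(desc_b, vocabulary)):
        index[bigram].append(frame_id)

def query_location(desc_a, desc_b, vocabulary):
    # Query side: probe every candidate bigram and collect matching frames.
    hits = []
    for bigram in product(nearest_words(desc_a, vocabulary),
                          nearest_words(desc_b, vocabulary)):
        hits.extend(index[bigram])
    return hits

rng = np.random.default_rng(1)
vocab = rng.random((100, 128))              # toy visual vocabulary
d1, d2 = rng.random((2, 128))
index_location(d1, d2, vocab, frame_id=42)
print(query_location(d1, d2, vocab))        # frame 42 retrieved once per candidate bigram
```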