{"title":"A robust audio fingerprinting method for content-based copy detection","authors":"Chahid Ouali, P. Dumouchel, Vishwa Gupta","doi":"10.1109/CBMI.2014.6849814","DOIUrl":null,"url":null,"abstract":"This paper presents a novel audio fingerprinting method that is highly robust to a variety of audio distortions. It is based on unconventional audio fingerprints generation scheme. The robustness is achieved by generating different versions of the spectrogram matrix of the audio signal by using a threshold based on the average of the spectral values to prune this matrix. We transform each version of this pruned spectrogram matrix into a 2-D binary image. Multiple 2-D images suppress noise to a varying degree. This varying degree of noise suppression improves likelihood of one of the images matching a reference image. To speed up matching, we convert each image into an n-dimensional vector, and perform a nearest neighbor search based on this n-dimensional vector. We test this method on TRECVID 2010 content-based copy detection evaluation dataset. Experimental results show the effectiveness of such fingerprints even when the audio is distorted. We compare the proposed method to a state-of-the-art audio copy detection system. 
Results of this comparison show that our method achieves an improvement of 22% in localization accuracy, and lowers minimal normalized detection cost rate (min NDCR) by half for audio transformations T1 and T2.","PeriodicalId":103056,"journal":{"name":"2014 12th International Workshop on Content-Based Multimedia Indexing (CBMI)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-06-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"24","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2014 12th International Workshop on Content-Based Multimedia Indexing (CBMI)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CBMI.2014.6849814","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 24
Abstract
This paper presents a novel audio fingerprinting method that is highly robust to a variety of audio distortions. It is based on an unconventional fingerprint generation scheme. Robustness is achieved by generating several versions of the spectrogram matrix of the audio signal, each pruned with a threshold based on the average of the spectral values. We transform each version of the pruned spectrogram matrix into a 2-D binary image. The resulting images suppress noise to varying degrees, which improves the likelihood that one of them matches a reference image. To speed up matching, we convert each image into an n-dimensional vector and perform a nearest neighbor search on this vector. We test the method on the TRECVID 2010 content-based copy detection evaluation dataset. Experimental results show the effectiveness of such fingerprints even when the audio is distorted. We compare the proposed method to a state-of-the-art audio copy detection system. The comparison shows that our method improves localization accuracy by 22% and halves the minimal normalized detection cost rate (min NDCR) for audio transformations T1 and T2.
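The pipeline described in the abstract (prune the spectrogram at thresholds derived from its mean, binarize, reduce each image to a vector, search for the nearest reference) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the threshold factors, the image-to-vector mapping, or the distance measure, so the `factors` values, the per-row pixel count, and the squared Euclidean distance below are all assumptions chosen for illustration. The spectrogram is assumed to be precomputed and passed in as a list of rows.

```python
def binary_images(spec, factors=(0.5, 1.0, 2.0)):
    """Return one 0/1 image per threshold factor.

    Each threshold is factor * mean(spec); the factor values are
    hypothetical, not taken from the paper.
    """
    values = [v for row in spec for v in row]
    mean = sum(values) / len(values)
    return [
        [[1 if v > f * mean else 0 for v in row] for row in spec]
        for f in factors
    ]


def image_to_vector(img):
    """Assumed reduction: number of set pixels in each row of the image."""
    return [sum(row) for row in img]


def nearest(query_vec, refs):
    """Brute-force nearest neighbor; refs maps a name to a vector.

    Squared Euclidean distance is an assumption for this sketch.
    """
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(refs, key=lambda name: dist(query_vec, refs[name]))


# Toy spectrogram (rows = frequency bins, columns = time frames).
spec = [
    [1, 2, 3, 4],
    [0, 0, 8, 0],
    [5, 1, 1, 1],
    [0, 2, 2, 2],
]
images = binary_images(spec)
vectors = [image_to_vector(img) for img in images]
```

Higher factors keep only the strongest spectral peaks, so each image trades detail for noise suppression. A query can then be matched by running `nearest` once per vector and keeping the best hit.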