{"title":"Automatic cell phone recognition from speech recordings","authors":"Ling Zou, Ji-Chen Yang, Tangsen Huang","doi":"10.1109/ChinaSIP.2014.6889318","DOIUrl":null,"url":null,"abstract":"Recording device recognition is an important research field of digital audio forensic. In this paper, we utilize Gaussian mixture model-universal background model (GMM-UBM) as the classifier to form a recording device recognition system. We examine the performance of Mel-frequency cepstral coefficients (MFCCs) and Power-normalized cepstral coefficients (PNCCs) to this problem. Experiments conducted on recordings come from 14 cell phones show that MFCCs are more effective than PNCCs in cell phone recognition. We find that the identification performance can be improved by stacking MFCCs and energy feature. We also investigate the effect of speaker mismatch and de-noising processing for acoustic feature to this problem. The highest identification accuracy achieved here is 97.71%.","PeriodicalId":248977,"journal":{"name":"2014 IEEE China Summit & International Conference on Signal and Information Processing (ChinaSIP)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"15","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2014 IEEE China Summit & International Conference on Signal and Information Processing (ChinaSIP)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ChinaSIP.2014.6889318","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 15
Abstract
Recording device recognition is an important research field of digital audio forensic. In this paper, we utilize Gaussian mixture model-universal background model (GMM-UBM) as the classifier to form a recording device recognition system. We examine the performance of Mel-frequency cepstral coefficients (MFCCs) and Power-normalized cepstral coefficients (PNCCs) to this problem. Experiments conducted on recordings come from 14 cell phones show that MFCCs are more effective than PNCCs in cell phone recognition. We find that the identification performance can be improved by stacking MFCCs and energy feature. We also investigate the effect of speaker mismatch and de-noising processing for acoustic feature to this problem. The highest identification accuracy achieved here is 97.71%.