{"title":"Variants of cepstrum based speaker identity verification","authors":"G. Velius","doi":"10.1109/ICASSP.1988.196652","DOIUrl":null,"url":null,"abstract":"Analysis parameters and various distance measures are investigated for a template matching scheme for speaker identity verification (SIV). Two parameters are systematically varied-the length of the signal analysis window, and the order of the linear predictive coding/-cepstrum analysis. Computational costs associated with the choice of parameters are also considered. The distance measures tested are the Euclidean, inverse variance weighting, differential mean weighting, Kahn's simplified weighting, the Mahalanobis distance, and the Fisher linear discriminant. Using the equal error rate (EER) of pairwise utterance dissimilarity distributions, performance is estimated for prespecified and (a simulation of) user-determined input vocabulary. Performance varies significantly across vocabulary, and average performance is approximately 5% EER for the better algorithms on telephone speech.<<ETX>>","PeriodicalId":448544,"journal":{"name":"ICASSP-88., International Conference on Acoustics, Speech, and Signal Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1988-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"23","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ICASSP-88., International Conference on Acoustics, Speech, and Signal Processing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICASSP.1988.196652","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 23
Abstract
Analysis parameters and various distance measures are investigated for a template matching scheme for speaker identity verification (SIV). Two parameters are systematically varied-the length of the signal analysis window, and the order of the linear predictive coding/-cepstrum analysis. Computational costs associated with the choice of parameters are also considered. The distance measures tested are the Euclidean, inverse variance weighting, differential mean weighting, Kahn's simplified weighting, the Mahalanobis distance, and the Fisher linear discriminant. Using the equal error rate (EER) of pairwise utterance dissimilarity distributions, performance is estimated for prespecified and (a simulation of) user-determined input vocabulary. Performance varies significantly across vocabulary, and average performance is approximately 5% EER for the better algorithms on telephone speech.<>