{"title":"Compensating for Mismatch in High-Level Speaker Recognition","authors":"William M. Campbell","doi":"10.1109/ODYSSEY.2006.248110","DOIUrl":null,"url":null,"abstract":"Speaker recognition using high-level features has been a successful area of exploration. Features obtained from many different levels (phones, words, prosodic events, etc.) are used to characterize the speaker. A good modeling technique for these features is the support vector machine (SVM). SVMs model the n-gram frequencies from speaker utterances in a high-dimensional SVM feature space and have shown excellent performance over a wide variety of high-level features. A complementary method recently explored in SVM speaker recognition is nuisance attribute projection (NAP). NAP removes directions from the SVM feature space that are superfluous to the task of speaker recognition, such as channel information and session variability. In this paper, we consider the application of NAP to high-level speaker recognition. We describe the difficulties in applying this method and propose solutions. We also conduct experiments showing that NAP can reduce variability in SVM feature space, leading to improved performance.","PeriodicalId":215883,"journal":{"name":"2006 IEEE Odyssey - The Speaker and Language Recognition Workshop","volume":"41 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2006-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"13","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2006 IEEE Odyssey - The Speaker and Language Recognition Workshop","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ODYSSEY.2006.248110","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 13
Abstract
Speaker recognition using high-level features has been a successful area of exploration. Features obtained from many different levels (phones, words, prosodic events, etc.) are used to characterize the speaker. A good modeling technique for these features is the support vector machine (SVM). SVMs model the n-gram frequencies from speaker utterances in a high-dimensional SVM feature space and have shown excellent performance over a wide variety of high-level features. A complementary method recently explored in SVM speaker recognition is nuisance attribute projection (NAP). NAP removes directions from the SVM feature space that are superfluous to the task of speaker recognition, such as channel information and session variability. In this paper, we consider the application of NAP to high-level speaker recognition. We describe the difficulties in applying this method and propose solutions. We also conduct experiments showing that NAP can reduce variability in SVM feature space, leading to improved performance.
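As a rough illustration of the projection idea described in the abstract, the sketch below removes a small number of nuisance directions from a set of feature vectors. It is not taken from the paper: the function names `nap_projection` and `apply_nap` are hypothetical, and estimating the nuisance subspace from within-speaker variation (each vector minus its speaker's mean) is one common formulation that is assumed here. The rows of `X` stand in for SVM feature-space expansions of utterances, such as n-gram frequency vectors.

```python
import numpy as np


def nap_projection(X, speaker_ids, k):
    """Estimate k orthonormal nuisance directions (assumed formulation).

    X is an (n_utterances, d) array and speaker_ids has one label per row.
    Returns U with shape (d, k).
    """
    X = np.asarray(X, dtype=float)
    speaker_ids = np.asarray(speaker_ids)
    # Subtract each speaker's mean so that only within-speaker
    # (session/channel) variation remains in the rows.
    centered = np.empty_like(X)
    for speaker in np.unique(speaker_ids):
        mask = speaker_ids == speaker
        centered[mask] = X[mask] - X[mask].mean(axis=0)
    # The leading right singular vectors are the directions of largest
    # within-speaker variability.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:k].T


def apply_nap(X, U):
    """Apply P = I - U U^T to every row of X without forming P explicitly."""
    X = np.asarray(X, dtype=float)
    return X - (X @ U) @ U.T
```

In a pipeline of this kind, `U` would typically be estimated once on background data with speaker labels, and `apply_nap` would then be applied to both training and test vectors before SVM training and scoring. The number of removed directions `k` is a tuning choice.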