{"title":"Speaker recognition using least squares IOHMMs","authors":"Niloy J. Mukherjee","doi":"10.1109/MMSP.2002.1203299","journal":{"name":"2002 IEEE Workshop on Multimedia Signal Processing.","volume":"17 1","pages":"0"},"publicationDate":"2002-12-09","publicationTypes":"Journal Article","isOpenAccess":false,"citationCount":"1","platform":"Semanticscholar","ListUrlMain":"https://doi.org/10.1109/MMSP.2002.1203299"}
The purpose of speaker recognition is to determine a speaker's identity from his or her speech utterances. Every speaker has physiological and behavioral characteristics embedded in those utterances. These characteristics can be extracted and statistically modeled. By matching unseen test speech against statistically trained models, a speaker's identity can be recognized. In this paper, we present a discriminative, classification-based approach to speaker recognition. The system uses input-output hidden Markov models (IOHMMs) based on regularized least squares regression (RLSR) as the classifier for closed-set, text-independent speaker identification. The IOHMM allows us to map input sequences to output sequences, using the same processing style as recurrent neural networks. The RLSR allows the IOHMM to be trained in a more discriminative style. The use of hidden Markov models (HMMs) and support vector machines (SVMs) has also been studied. The performance of the system is assessed using a set of male and female speakers drawn from the TIMIT corpus.
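To make the regularized least squares idea concrete, here is a minimal sketch of a closed-set classifier trained by RLSR on one-hot speaker targets. It is not the IOHMM described in the abstract: it has no hidden states or transitions, and it only illustrates the least squares component. The synthetic 2-D "frames", the regularization weight `lam`, the frame-averaging decision rule, and the function names `fit_rlsr` and `identify` are all assumptions made for illustration.

```python
import numpy as np

def fit_rlsr(X, labels, n_classes, lam=1e-2):
    """Closed-form regularized least squares fit to one-hot speaker targets."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # append a bias column
    Y = np.eye(n_classes)[labels]                  # one-hot target per frame
    # Solve (Xb^T Xb + lam * I) W = Xb^T Y; the bias is regularized too,
    # which is a simplification chosen for brevity.
    A = Xb.T @ Xb + lam * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ Y)            # shape (d + 1, n_classes)

def identify(W, frames):
    """Average per-frame scores over an utterance and pick the top speaker."""
    Xb = np.hstack([frames, np.ones((frames.shape[0], 1))])
    return int(np.argmax((Xb @ W).mean(axis=0)))

# Synthetic stand-in for acoustic features: three "speakers", each a
# Gaussian cluster in 2-D (real systems would use cepstral-style features).
rng = np.random.default_rng(0)
centers = np.array([[5.0, 0.0], [-5.0, 5.0], [-5.0, -5.0]])
labels = np.repeat(np.arange(3), 60)
X = centers[labels] + 0.5 * rng.standard_normal((180, 2))
W = fit_rlsr(X, labels, n_classes=3)

# One 20-frame test "utterance" per speaker.
test_utts = [centers[k] + 0.5 * rng.standard_normal((20, 2)) for k in range(3)]
predicted = [identify(W, u) for u in test_utts]
print(predicted)
```

Averaging frame scores before taking the argmax is one simple way to turn frame-level outputs into an utterance-level, text-independent decision; the paper's sequence model would replace that step.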