Speaker recognition using least squares IOHMMs

2002 IEEE Workshop on Multimedia Signal Processing. Pub Date: 2002-12-09. DOI: 10.1109/MMSP.2002.1203299

Niloy J. Mukherjee


Citations: 1

Abstract

The purpose of speaker recognition is to determine a speaker's identity from his/her speech utterances. Every speaker has physiological and behavioral characteristics embedded in his/her speech. These characteristics can be extracted from utterances and modeled statistically. By matching unseen test speech against the statistically trained models, the speaker's identity can be recognized. In this paper, we present a discriminative, classification-based approach to speaker recognition. The system uses input-output hidden Markov models (IOHMMs) based on regularized least squares regression (RLSR) as the classifier for closed-set, text-independent speaker identification. The IOHMM allows us to map input sequences to output sequences, using the same processing style as recurrent neural networks. The RLSR allows the IOHMM to be trained in a more discriminative style. The use of hidden Markov models (HMMs) and support vector machines (SVMs) has also been studied. The performance of the system is assessed using a set of male and female speakers drawn from the TIMIT corpus.
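As a rough illustration of the regularized least squares building block named in the abstract, the sketch below fits a ridge regression onto one-hot speaker targets and identifies a test utterance by averaging per-frame scores. It is not the paper's IOHMM and does not model any state sequence. The function names `fit_rlsr` and `identify_speaker`, the regularization weight `lam`, and the synthetic 4-dimensional features (standing in for real acoustic features) are all assumptions made for this example.

```python
import numpy as np

def fit_rlsr(X, labels, num_speakers, lam=1e-2):
    """Fit regularized least squares regression onto one-hot speaker targets."""
    # Append a constant column so the model has a bias term.
    X_aug = np.hstack([X, np.ones((X.shape[0], 1))])
    # One-hot target matrix of shape (num_frames, num_speakers).
    Y = np.eye(num_speakers)[labels]
    d = X_aug.shape[1]
    # Closed-form ridge solution W = (X^T X + lam I)^{-1} X^T Y.
    # Simplification: the bias weight is regularized along with the others.
    return np.linalg.solve(X_aug.T @ X_aug + lam * np.eye(d), X_aug.T @ Y)

def identify_speaker(W, frames):
    """Average per-frame scores over an utterance and return the best speaker index."""
    frames_aug = np.hstack([frames, np.ones((frames.shape[0], 1))])
    scores = frames_aug @ W  # shape (num_frames, num_speakers)
    return int(np.argmax(scores.mean(axis=0)))

# Synthetic closed set of three speakers (hypothetical data, not TIMIT).
rng = np.random.default_rng(0)
num_speakers = 3
means = np.array([[5.0, 0.0, 0.0, 0.0],
                  [0.0, 5.0, 0.0, 0.0],
                  [0.0, 0.0, 5.0, 0.0]])
labels = np.repeat(np.arange(num_speakers), 200)
X = means[labels] + rng.normal(size=(labels.size, means.shape[1]))

W = fit_rlsr(X, labels, num_speakers)

# An unseen 50-frame utterance generated from speaker 1.
test_utterance = means[1] + rng.normal(size=(50, means.shape[1]))
predicted = identify_speaker(W, test_utterance)
print(predicted)
```

Averaging scores over frames is one simple way to make a text-independent, utterance-level decision; the system described in the abstract instead passes the sequence through an IOHMM.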
