An integrated speech-background model for robust speaker identification

[Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing. Pub Date: 1992-03-23. DOI: 10.1109/ICASSP.1992.226089

Douglas A. Reynolds, R. Rose

Citations: 10

Abstract

A procedure is examined for text-independent speaker identification in noisy environments where the interfering background signals cannot be characterized using traditional broadband or impulsive noise models. In the procedure, both the speaker and the background processes are modeled using mixtures of Gaussians. Speaker and background models are integrated into a unified statistical framework, allowing the underlying speech process to be decoupled from the noise-corrupted observations via the expectation-maximization (EM) algorithm. Using this formalism, speaker model parameters are estimated in the presence of the background process, and a scoring procedure is implemented for computing the speaker likelihood in the noise-corrupted environment. The performance was evaluated using a 16-speaker conversational speech database with both speech babble and white noise background processes.
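The abstract does not give the model equations, so the following is only a rough sketch of the general idea of scoring test frames against per-speaker Gaussian mixture models and picking the most likely speaker. It assumes diagonal covariances and clean features, and it does not implement the paper's integrated speech-background model or its EM-based parameter estimation. The names `gmm_log_likelihood`, `identify_speaker`, `toy_models`, and `toy_frames` are hypothetical and exist only for illustration.

```python
import numpy as np


def gmm_log_likelihood(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM.

    frames: (T, D) feature vectors; weights: (M,) mixture weights;
    means, variances: (M, D) per-component parameters.
    Assumption: diagonal covariances, chosen here for simplicity.
    """
    diff = frames[:, None, :] - means[None, :, :]            # (T, M, D)
    log_gauss = -0.5 * (
        np.sum(diff ** 2 / variances, axis=2)                # Mahalanobis term
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)    # normalizer
    )                                                        # (T, M)
    log_joint = np.log(weights) + log_gauss                  # (T, M)
    # Numerically stable log-sum-exp over the mixture components.
    peak = log_joint.max(axis=1, keepdims=True)
    log_frame = peak[:, 0] + np.log(np.exp(log_joint - peak).sum(axis=1))
    return float(log_frame.mean())


def identify_speaker(frames, speaker_models):
    """Return the best-scoring speaker name and all scores."""
    scores = {
        name: gmm_log_likelihood(frames, *params)
        for name, params in speaker_models.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical toy models: two speakers, two components each, 2-D features.
toy_models = {
    "speaker_a": (np.array([0.5, 0.5]),
                  np.array([[0.0, 0.0], [1.0, 1.0]]),
                  np.ones((2, 2))),
    "speaker_b": (np.array([0.5, 0.5]),
                  np.array([[3.0, 3.0], [4.0, 4.0]]),
                  np.ones((2, 2))),
}
toy_frames = np.array([[2.9, 3.1], [3.0, 2.8]])
best_speaker, all_scores = identify_speaker(toy_frames, toy_models)
```

In a noisy setting like the one the abstract describes, this plain scoring step would presumably be replaced by a likelihood that also accounts for the background mixture model; how the two are combined is not stated in the abstract and is left out here.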
