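Investigating the use of modulation spectral features within an i-vector framework for far-field automatic speaker verification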

2014 International Telecommunications Symposium (ITS). Pub Date: 2014-11-06. DOI: 10.1109/ITS.2014.6948012

Anderson R. Avila, F. Fraga, M. Sarria-Paja, T. Falk

Citations: 1

Abstract

Channel variability is known to compromise automatic speaker recognition accuracy. However, little attention has so far been given to the detrimental effects of reverberant environments. In this paper, we focus on automatic speaker verification (ASV) under several levels of room reverberation and explore alternative auditory-inspired features. Specifically, we investigate whether modulation spectral features (MSFs) can outperform the well-known mel-frequency cepstral coefficients (MFCCs). Experiments were conducted with an ASV system based on the state-of-the-art i-vector framework. The main contribution of this paper is to verify whether MSFs combined with i-vectors achieve the performance reported in the literature for speech recognition and speaker identification systems in reverberant environments.
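The abstract does not describe how the modulation spectral features are extracted, so the following is only a rough illustration of the underlying idea: a modulation spectrum describes how fast the temporal envelope of a signal fluctuates. This sketch assumes a single full-band envelope built from frame RMS values and a plain DFT over that envelope; it is not the authors' feature pipeline, which is not given here and may well use an auditory filterbank and a modulation filterbank. All function names, the toy signal, and the frame length are hypothetical choices made for this example.

```python
import cmath
import math


def frame_envelope(signal, frame_len):
    """RMS of consecutive non-overlapping frames (a crude temporal envelope)."""
    n_frames = len(signal) // frame_len
    envelope = []
    for i in range(n_frames):
        frame = signal[i * frame_len:(i + 1) * frame_len]
        envelope.append(math.sqrt(sum(x * x for x in frame) / frame_len))
    return envelope


def modulation_spectrum(envelope, env_rate):
    """Magnitude DFT of the mean-removed envelope.

    Returns (modulation frequencies in Hz, magnitudes) for non-negative bins.
    """
    n = len(envelope)
    mean = sum(envelope) / n
    centered = [e - mean for e in envelope]
    freqs, mags = [], []
    for k in range(n // 2 + 1):
        acc = sum(
            c * cmath.exp(-2j * math.pi * k * t / n)
            for t, c in enumerate(centered)
        )
        freqs.append(k * env_rate / n)
        mags.append(abs(acc))
    return freqs, mags


def dominant_modulation_hz(signal, sample_rate, frame_len):
    """Modulation frequency with the largest magnitude."""
    envelope = frame_envelope(signal, frame_len)
    freqs, mags = modulation_spectrum(envelope, sample_rate / frame_len)
    peak = max(range(len(mags)), key=mags.__getitem__)
    return freqs[peak]


# Toy input (assumption): a 500 Hz carrier amplitude-modulated at 4 Hz.
SAMPLE_RATE = 8000
FRAME_LEN = 80  # 10 ms frames, so the envelope is sampled at 100 Hz
toy = [
    (1.0 + 0.8 * math.sin(2 * math.pi * 4 * n / SAMPLE_RATE))
    * math.sin(2 * math.pi * 500 * n / SAMPLE_RATE)
    for n in range(2 * SAMPLE_RATE)
]
peak_hz = dominant_modulation_hz(toy, SAMPLE_RATE, FRAME_LEN)
print(peak_hz)
```

For this toy signal the strongest modulation component falls at the 4 Hz modulation rate rather than at the 500 Hz carrier, which is the kind of slow envelope information that modulation-domain features are meant to capture.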

Source publication

2014 International Telecommunications Symposium (ITS)
