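Investigating the use of modulation spectral features within an i-vector framework for far-field automatic speaker verification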

Anderson R. Avila, F. Fraga, M. Sarria-Paja, T. Falk
Published in: 2014 International Telecommunications Symposium (ITS)
Publication date: 2014-11-06
DOI: 10.1109/ITS.2014.6948012 (https://doi.org/10.1109/ITS.2014.6948012)
Citations: 1

Abstract

It is known that channel variability compromises the accuracy of automatic speaker recognition. However, little attention has so far been given to the detrimental effects of reverberant environments. In this paper, we focus on automatic speaker verification (ASV) under several levels of room reverberation and explore alternative auditory-inspired features. Specifically, we investigate whether the so-called modulation spectral features (MSFs) can outperform the well-known mel-frequency cepstral coefficients (MFCCs). Experiments were conducted with an ASV system based on the state-of-the-art i-vector framework. The main contribution of this paper is to verify whether MSFs combined with i-vectors can match the performance reported in the literature for speech recognition and speaker identification systems in reverberant environments.