Anderson R. Avila, F. Fraga, M. Sarria-Paja, T. Falk
{"title":"研究在远场自动扬声器验证的i向量框架内使用调制光谱特征","authors":"Anderson R. Avila, F. Fraga, M. Sarria-Paja, T. Falk","doi":"10.1109/ITS.2014.6948012","DOIUrl":null,"url":null,"abstract":"It is known that channel variability compromises automatic speaker recognition accuracy. However, little attention has been given so far to the detrimental effects encountered under reverberant environments. In this paper, we focus on the issue of automatic speaker verification (ASV) under several levels of room reverberation. Alternative auditory inspired features are explored. Specifically, we investigate whether the performance of the so-called modulation spectral features (MSFs) can overcome the well-known mel-frequency cepstral coefficients (MFCCs). Experiments were conducted with an ASV system based on the state-of-the-art i-vector. The main contribution of this paper is to verify if MSFs combined with i-vectors are able to present the same performance encountered in the literature regarding speech recognition and speaker identification systems in reverberant environment.","PeriodicalId":359348,"journal":{"name":"2014 International Telecommunications Symposium (ITS)","volume":"97 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Investigating the use of modulation spectral features within an i-vector framework for far-field automatic speaker verification\",\"authors\":\"Anderson R. Avila, F. Fraga, M. Sarria-Paja, T. Falk\",\"doi\":\"10.1109/ITS.2014.6948012\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"It is known that channel variability compromises automatic speaker recognition accuracy. However, little attention has been given so far to the detrimental effects encountered under reverberant environments. In this paper, we focus on the issue of automatic speaker verification (ASV) under several levels of room reverberation. Alternative auditory inspired features are explored. Specifically, we investigate whether the performance of the so-called modulation spectral features (MSFs) can overcome the well-known mel-frequency cepstral coefficients (MFCCs). Experiments were conducted with an ASV system based on the state-of-the-art i-vector. The main contribution of this paper is to verify if MSFs combined with i-vectors are able to present the same performance encountered in the literature regarding speech recognition and speaker identification systems in reverberant environment.\",\"PeriodicalId\":359348,\"journal\":{\"name\":\"2014 International Telecommunications Symposium (ITS)\",\"volume\":\"97 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2014-11-06\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2014 International Telecommunications Symposium (ITS)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ITS.2014.6948012\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2014 International Telecommunications Symposium (ITS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ITS.2014.6948012","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Investigating the use of modulation spectral features within an i-vector framework for far-field automatic speaker verification
It is known that channel variability compromises automatic speaker recognition accuracy. However, little attention has been given so far to the detrimental effects encountered under reverberant environments. In this paper, we focus on the issue of automatic speaker verification (ASV) under several levels of room reverberation. Alternative auditory inspired features are explored. Specifically, we investigate whether the performance of the so-called modulation spectral features (MSFs) can overcome the well-known mel-frequency cepstral coefficients (MFCCs). Experiments were conducted with an ASV system based on the state-of-the-art i-vector. The main contribution of this paper is to verify if MSFs combined with i-vectors are able to present the same performance encountered in the literature regarding speech recognition and speaker identification systems in reverberant environment.