Speaker Identification System Under Noisy Conditions

Md. Shariful Alam, M. S. Zilany
Published in: 2019 5th International Conference on Advances in Electrical Engineering (ICAEE), September 2019.
DOI: 10.1109/icaee48663.2019.8975420

Abstract

Speaker identification (SID) systems need to be robust to extrinsic variations in the speech signal, such as background noise, to be applicable in many real-life scenarios. Mel-frequency cepstral coefficient (MFCC)-based i-vector systems are regarded as the state-of-the-art technique for speaker identification, but it is well known that the performance of traditional methods, in which features are extracted mostly from the properties of the acoustic signal, degrades substantially under noisy conditions. This study proposes a robust SID system that uses the neural responses of a physiologically based computational model of the auditory periphery. Two-dimensional neurograms were constructed from the simulated responses of auditory-nerve fibers to speech signals from the TIMIT database. The neurogram coefficients were trained using an i-vector-based system to generate an identity model for each speaker, and performance was evaluated in quiet and under noisy conditions and compared with the results of existing methods such as MFCC, frequency-domain linear prediction (FDLP), and Gammatone frequency cepstral coefficients (GFCC). Results showed that the proposed system outperformed all existing acoustic-signal-based methods both in quiet and under noisy conditions.
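The paper does not publish its implementation, but the backend it describes follows the standard i-vector identification pattern: each enrolled speaker is represented by an identity vector, and a test utterance's i-vector is scored against every model, commonly by cosine similarity. The following is a minimal sketch of that scoring step only, with synthetic low-dimensional vectors standing in for real i-vectors (the function names and the 4-dimensional vectors are illustrative assumptions, not from the paper):

```python
import numpy as np

def cosine_score(w_test, w_model):
    """Cosine similarity between a test i-vector and a speaker-model i-vector."""
    return float(np.dot(w_test, w_model) /
                 (np.linalg.norm(w_test) * np.linalg.norm(w_model)))

def identify(w_test, speaker_models):
    """Score the test i-vector against every enrolled model; return the best match."""
    scores = {spk: cosine_score(w_test, w) for spk, w in speaker_models.items()}
    return max(scores, key=scores.get), scores

# Synthetic 4-dimensional "i-vectors" for illustration only;
# real systems use a few hundred dimensions extracted from speech features.
models = {
    "spk_a": np.array([1.0, 0.0, 0.2, 0.1]),
    "spk_b": np.array([0.0, 1.0, 0.1, 0.3]),
}
test_vec = np.array([0.9, 0.1, 0.25, 0.05])
best, scores = identify(test_vec, models)
```

The same scoring backend applies regardless of the front-end features (MFCC, GFCC, FDLP, or neurogram coefficients), which is what makes the paper's comparison across feature types possible.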