Customized Speaker Verification System with Noise-Cancellation using Blind Source Separation
Tsung-Han Tsai, Ping-Cheng Hao, Fong-Lin Tsai
2022 IEEE International Conference on Consumer Electronics - Taiwan, published 2022-07-06
DOI: 10.1109/ICCE-Taiwan55306.2022.9869211
Abstract
This paper proposes a customized speaker verification system combined with noise cancellation based on blind source separation. The system has two phases: noise cancellation and speaker verification. In the noise-cancellation phase, a fast time-frequency masking technique based on the Short-Time Fourier Transform (STFT) is proposed to separate a mixture of two input sounds contained in a single signal. The separated speech is then passed to the wake-up word system. In the speaker verification phase, Mel-Frequency Cepstral Coefficients (MFCC) are used for feature extraction. The extracted features are then used to train a voiceprint model of the speaker with a Gaussian mixture model (GMM) and a state sequence model with a hidden Markov model (HMM). The system is analyzed on noisy speech signals corrupted by white noise arriving from different angles. Based on the output Signal-to-Interference Ratio (SIR) and Signal-to-Distortion Ratio (SDR), the proposed system achieves improved accuracy. Promising results were also obtained in a real experimental environment.