An hybrid front-end for robust speaker identification under noisy conditions
Authors: El Bachir Tazi, Noureddine El Makhfi
Venue: 2017 Intelligent Systems Conference (IntelliSys)
Publication date: 2017-09-01
DOI: https://doi.org/10.1109/INTELLISYS.2017.8324215
Automatic speaker identification systems perform acceptably on clean speech, but they become practically unstable in noisy environments, so their robustness remains a delicate research problem. We study a novel hybrid feature extractor that combines the robust Relative Spectral Transform Perceptual Linear Prediction (RASTA-PLP) method with conventional Mel Frequency Cepstral Coefficients (MFCC). Experiments were carried out on a database of 51 speakers, with the system trained on clean speech and the test data degraded by additive white Gaussian noise at SNR levels ranging from 40 dB down to 0 dB. They show that the proposed hybrid front-end, which places the MFCC and RASTA-PLP parameters in the same feature vector, gives better results than either method used separately. An accuracy improvement of about 3.38% was observed relative to the baseline MFCC method.
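
The two mechanical steps mentioned in the abstract, degrading test speech with white Gaussian noise at a chosen SNR and placing two feature streams in one vector, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names `add_awgn` and `fuse_features`, the 13 coefficients per stream, and the synthetic sine wave standing in for speech are all assumptions made for illustration, and the feature matrices are random placeholders rather than real MFCC or RASTA-PLP output.

```python
import math
import random


def mean_power(samples):
    """Average power of a sequence of samples."""
    return sum(x * x for x in samples) / len(samples)


def add_awgn(signal, snr_db, rng):
    """Return `signal` plus white Gaussian noise scaled to the requested SNR (dB)."""
    # SNR_dB = 10 * log10(P_signal / P_noise)  =>  P_noise = P_signal / 10^(SNR_dB / 10)
    noise_power = mean_power(signal) / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def fuse_features(mfcc, rasta_plp):
    """Concatenate two frame-aligned feature streams frame by frame (hypothetical helper)."""
    if len(mfcc) != len(rasta_plp):
        raise ValueError("both feature streams must have the same number of frames")
    return [list(a) + list(b) for a, b in zip(mfcc, rasta_plp)]


rng = random.Random(0)

# One second of a 440 Hz tone at 16 kHz stands in for a clean speech recording.
clean = [math.sin(2 * math.pi * 440.0 * n / 16000.0) for n in range(16000)]

for snr_db in (40, 20, 0):
    noisy = add_awgn(clean, snr_db, rng)
    noise = [n - c for n, c in zip(noisy, clean)]
    measured = 10 * math.log10(mean_power(clean) / mean_power(noise))
    print(f"target {snr_db} dB, measured {measured:.1f} dB")

# Placeholder features: 100 frames, 13 coefficients per stream (assumed sizes).
mfcc = [[rng.gauss(0.0, 1.0) for _ in range(13)] for _ in range(100)]
rasta_plp = [[rng.gauss(0.0, 1.0) for _ in range(13)] for _ in range(100)]
hybrid = fuse_features(mfcc, rasta_plp)
print(len(hybrid), len(hybrid[0]))
```

In this sketch the hybrid vector is a plain per-frame concatenation, so its dimension is the sum of the two stream dimensions. Any normalization or weighting of the two streams before fusion is left out, since the abstract does not describe it.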