Fidalizia Pyrtuh, Sarfaraz Jelil, Geetima Kachari, L. J. Singh
Published in: 2013 Fourth National Conference on Computer Vision, Pattern Recognition, Image Processing and Graphics (NCVPRIPG), 2013-12-01. DOI: 10.1109/NCVPRIPG.2013.6776237 (https://doi.org/10.1109/NCVPRIPG.2013.6776237)
Comparative evaluation of feature normalization techniques for voice password based speaker verification
This paper presents a comparative study of feature-level normalization techniques in a voice-password-based speaker verification system. The input speech samples are recorded at different times and in different environments, so they vary because of environmental interference, noise, the speaker's emotional state, and similar factors. Each sample is a human voice speaking a unique password, recorded at three different times or days. The samples are processed by sampling, pre-emphasis, MFCC (mel-frequency cepstral coefficient) feature extraction, and DTW (dynamic time warping). To enhance the features, we apply three popular feature normalization techniques, namely MVN (Mean and Variance Normalization), CMN (Cepstral Mean Normalization), and PCA (Principal Component Analysis), and analyze the result of each technique individually. The objective is to compare the performance and efficiency of these techniques and determine which gives the best verification rate. According to our findings, CMN gives the best results.
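The abstract does not give implementation details, so the following is only a minimal sketch of how CMN, MVN, and a DTW comparison are commonly defined, not the authors' code. It assumes each utterance is already a list of MFCC frames (one list of coefficients per frame). The function names `cmn`, `mvn`, and `dtw_distance` and the toy data are hypothetical. PCA is omitted to keep the example dependency-free.

```python
import math


def cmn(frames):
    """Cepstral Mean Normalization: subtract each coefficient's
    mean over the whole utterance."""
    n, dims = len(frames), len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dims)]
    return [[f[d] - means[d] for d in range(dims)] for f in frames]


def mvn(frames, eps=1e-10):
    """Mean and Variance Normalization: zero mean and unit variance
    per coefficient (eps guards against a constant coefficient)."""
    n, dims = len(frames), len(frames[0])
    centered = cmn(frames)
    stds = [math.sqrt(sum(f[d] ** 2 for f in centered) / n) for d in range(dims)]
    return [[f[d] / (stds[d] + eps) for d in range(dims)] for f in centered]


def dtw_distance(a, b):
    """Accumulated DTW cost between two frame sequences, using the
    Euclidean distance between frames as the local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # step in a only
                cost[i][j - 1],      # step in b only
                cost[i - 1][j - 1],  # step in both
            )
    return cost[n][m]


# Toy data (made up): an enrolled template and the same utterance with a
# constant per-coefficient offset, a crude stand-in for a channel change.
template = [[1.0, 2.0], [2.0, 3.0], [3.0, 5.0]]
offset = [0.5, -1.0]
shifted = [[x + o for x, o in zip(frame, offset)] for frame in template]

raw_cost = dtw_distance(template, shifted)
cmn_cost = dtw_distance(cmn(template), cmn(shifted))
print(raw_cost > cmn_cost)
```

In this toy case the constant offset inflates the raw DTW cost, while CMN removes it before matching. That illustrates why mean normalization can help with session variability, but it does not reproduce the paper's experiments or its verification rates.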