Authors: S. V. Chougule, M. Chavan
Venue: 2015 International Conference on Computing, Communication and Security (ICCCS)
Publication date: 2015-12-01
DOI: https://doi.org/10.1109/CCCS.2015.7374210
Effectiveness of LSP features for text independent speaker identification
Speech features used for speaker recognition should uniquely reflect the characteristics of the speaker's vocal tract and carry negligible information about the linguistic content of the speech. Cepstral features such as Linear Predictive Cepstral Coefficients (LPCCs) and Mel Frequency Cepstral Coefficients (MFCCs) are the most commonly used features for speaker recognition, but they are sensitive to noise and distortion. Complementary features originally used for speech recognition can also be useful for speaker recognition. In this work, Line Spectral Pair (LSP) features, derived from the baseline linear predictive coefficients, are used for text-independent speaker identification. With LSP features, the power spectral density at a given frequency tends to depend only on the LSPs close to that frequency. In contrast, with cepstral features, a change in a single parameter affects the whole spectrum. The goal is to compare the performance of LSP features against conventional cepstral features in the presence of acoustic disturbance. Experiments are carried out on the TIMIT and NTIMIT datasets to analyze performance under acoustic and channel distortions. The LSP features perform as well as conventional cepstral features on TIMIT and show improved identification results on NTIMIT.
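The abstract does not say how the LSP features are computed from the linear predictive coefficients, so the following is only a minimal sketch of the standard textbook construction, not the authors' implementation. It forms the symmetric and antisymmetric polynomials P(z) = A(z) + z^-(p+1) A(1/z) and Q(z) = A(z) - z^-(p+1) A(1/z) from the LPC polynomial A(z) and takes the angles of their unit-circle roots as line spectral frequencies. The function name `lpc_to_lsf`, the tolerance `eps`, and the example pole positions are assumptions chosen for illustration.

```python
import numpy as np

def lpc_to_lsf(a, eps=1e-6):
    """Convert LPC coefficients [1, a1, ..., ap] to line spectral
    frequencies in radians, sorted in ascending order.

    Illustrative sketch: assumes A(z) is minimum phase (all roots
    inside the unit circle), as produced by autocorrelation LPC.
    """
    a = np.asarray(a, dtype=float)
    padded = np.concatenate([a, [0.0]])
    mirrored = np.concatenate([[0.0], a[::-1]])
    p_poly = padded + mirrored   # symmetric polynomial P(z)
    q_poly = padded - mirrored   # antisymmetric polynomial Q(z)
    roots = np.concatenate([np.roots(p_poly), np.roots(q_poly)])
    angles = np.angle(roots)
    # Keep one angle per conjugate pair and drop the trivial roots at z = +1 and z = -1.
    keep = (angles > eps) & (angles < np.pi - eps)
    return np.sort(angles[keep])

# Hypothetical 4th-order example: two resonances placed inside the unit circle.
poles = [0.9 * np.exp(0.5j), 0.9 * np.exp(-0.5j),
         0.8 * np.exp(1.5j), 0.8 * np.exp(-1.5j)]
a = np.real(np.poly(poles))      # LPC polynomial coefficients [1, a1, ..., a4]
lsf = lpc_to_lsf(a)
print(lsf)                       # four increasing frequencies in (0, pi)
```

For an order-p predictor this yields p frequencies in (0, π). Because each frequency is tied to a pair of roots on the unit circle, perturbing one of them mainly reshapes the spectrum near that frequency, which is the locality property the abstract contrasts with cepstral coefficients.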