Significance of the LP-MVDR spectral ratio method in Whisper Detection
Arpit Mathur, R. Hegde
2011 National Conference on Communications (NCC), 2011-03-17
DOI: 10.1109/NCC.2011.5734719 (https://doi.org/10.1109/NCC.2011.5734719)
Citations: 4
Abstract
This paper proposes a new spectral ratio method for detecting whispered segments within a normally phonated speech stream. The method computes the ratio of the linear prediction (LP) spectrum to the minimum variance distortionless response (MVDR) spectrum. The linear prediction method and the LP residual method are each found to be inadequate, on their own, for modelling the medium to high frequencies of the speech signal. In contrast, the MVDR method is robust in modelling the spectrum at all frequencies. The proposed spectral ratio method exploits this difference between the two spectral estimates to separate whispered segments, which have fewer harmonics and more noise, from normally phonated segments of speech. The proposed method is compared with other methods, including the LP residual and spectral flatness methods. Whisper detection experiments are conducted on the CHAINS database. The proposed method shows reasonable improvement, as seen in the ROC curves and the whisper diarization error rate.
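The abstract does not state the formulas behind the ratio, so the following is only a minimal sketch built from the textbook definitions: the all-pole LP spectrum obtained from the autocorrelation normal equations, and the MVDR spectrum 1 / (v^H R^-1 v) for the Toeplitz autocorrelation matrix R. The function names, the model order of 12, the 257-point frequency grid, and the small diagonal loading are all assumptions made for illustration, not the authors' settings, and the paper's actual decision statistic may differ.

```python
import numpy as np


def autocorrelation(frame, order):
    """Biased autocorrelation estimates r[0..order] of one (windowed) frame."""
    n = len(frame)
    r = np.array([np.dot(frame[: n - k], frame[k:]) for k in range(order + 1)]) / n
    # Tiny diagonal loading (an assumption) so the matrix stays invertible.
    r[0] = r[0] * (1.0 + 1e-9) + 1e-12
    return r


def toeplitz(r):
    """Symmetric Toeplitz matrix with entries R[i, j] = r[|i - j|]."""
    lags = np.abs(np.subtract.outer(np.arange(len(r)), np.arange(len(r))))
    return r[lags]


def lp_spectrum(r, n_freq):
    """All-pole LP spectrum err / |A(w)|^2 on n_freq points in [0, pi]."""
    order = len(r) - 1
    # Normal equations for A(z) = 1 + a_1 z^-1 + ... + a_M z^-M.
    a = np.linalg.solve(toeplitz(r[:order]), -r[1:])
    err = r[0] + np.dot(a, r[1:])  # prediction error power
    A = np.fft.rfft(np.concatenate(([1.0], a)), 2 * (n_freq - 1))
    return err / np.abs(A) ** 2


def mvdr_spectrum(r, n_freq):
    """MVDR (Capon) spectrum 1 / (v^H R^-1 v) on n_freq points in [0, pi]."""
    order = len(r) - 1
    R_inv = np.linalg.inv(toeplitz(r))
    w = np.linspace(0.0, np.pi, n_freq)
    V = np.exp(1j * np.outer(np.arange(order + 1), w))  # steering vectors as columns
    denom = np.real(np.sum(V.conj() * (R_inv @ V), axis=0))
    return 1.0 / denom


def spectral_ratio(frame, order=12, n_freq=257):
    """Per-frequency ratio of the LP spectrum to the MVDR spectrum for one frame."""
    r = autocorrelation(frame, order)
    return lp_spectrum(r, n_freq) / mvdr_spectrum(r, n_freq)
```

A frame-level detector would still need to reduce this per-frequency ratio to a single score and compare it with a threshold; how the paper does that is not given in the abstract. With the scaling used here, the ratio is at least 1 at every frequency, because the reciprocal of the MVDR spectrum equals the sum of the reciprocals of the LP spectra of orders 0 through M.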