Disordered speech quality estimation using linear prediction
Y. S. E. Ali, V. Parsa, P. Doyle, S. Berkane
2017 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing (PACRIM), 2017-08-01
DOI: 10.1109/PACRIM.2017.8121897 (https://doi.org/10.1109/PACRIM.2017.8121897)
Citations: 0
Abstract
Tracheoesophageal (TE) speech is produced by patients who have undergone a total laryngectomy, in which the larynx (voice box) is removed and replaced by a tracheoesophageal puncture. This work presents a novel low-complexity algorithm for estimating the severity of disordered TE speech. The proposed algorithm uses features computed from 32-ms voiced frames of the speech signal. A 21st-order linear prediction (LPC) analysis is performed on each voiced frame, and high-order statistics (moments: mean, standard deviation, skewness, and kurtosis) are extracted from the LPC coefficients, the cepstral coefficients, and the LPC residual signal. Each of these moments is averaged over all voiced frames and combined with the average pitch, yielding a total of 14 quality features. Experimental results on two databases (20 and 35 TE speakers) showed that the proposed quality estimates correlate well with subjective scores, with correlation coefficients between 0.81 and 0.86.
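The per-frame analysis described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the abstract does not specify the LPC estimation method, the sampling rate, the number of cepstral coefficients, or any windowing, so the sketch assumes the autocorrelation method with the Levinson-Durbin recursion, a 16 kHz sampling rate (512 samples per 32-ms frame), as many cepstral coefficients as the LPC order, population (biased) moments, and no window. All function names are hypothetical. Voicing detection and pitch estimation are omitted, so the sketch produces only the 12 averaged moment features and does not attempt to reproduce the exact 14-feature set.

```python
import math
import random


def autocorr(frame, max_lag):
    """Autocorrelation r[0..max_lag] of one frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + k] for i in range(n - k))
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations; returns A(z) coefficients with a[0] = 1."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k                  # prediction-error energy update
    return a


def lpc_residual(frame, a):
    """Prediction error e[n] = sum_j a[j] * x[n - j] (inverse filtering by A(z))."""
    p = len(a) - 1
    return [sum(a[j] * frame[n - j] for j in range(min(p, n) + 1))
            for n in range(len(frame))]


def lpc_to_cepstrum(a, n_ceps):
    """Cepstrum c[1..n_ceps] of the all-pole model 1/A(z) via the usual recursion."""
    p = len(a) - 1
    coef = lambda i: a[i] if i <= p else 0.0
    c = [0.0] * (n_ceps + 1)
    for n in range(1, n_ceps + 1):
        acc = sum(k * c[k] * coef(n - k) for k in range(1, n))
        c[n] = -coef(n) - acc / n
    return c[1:]


def four_moments(x):
    """Mean, standard deviation, skewness, kurtosis (population definitions)."""
    n = len(x)
    mean = sum(x) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    if std == 0.0:
        return (mean, 0.0, 0.0, 0.0)
    skew = sum((v - mean) ** 3 for v in x) / (n * std ** 3)
    kurt = sum((v - mean) ** 4 for v in x) / (n * std ** 4)
    return (mean, std, skew, kurt)


def frame_features(frame, order=21):
    """12 moments: 4 each from LPC coefficients, cepstrum, and residual."""
    a = levinson_durbin(autocorr(frame, order), order)
    sources = (a[1:], lpc_to_cepstrum(a, order), lpc_residual(frame, a))
    return [m for s in sources for m in four_moments(s)]


def utterance_features(voiced_frames, order=21):
    """Average each per-frame moment over all voiced frames."""
    per_frame = [frame_features(f, order) for f in voiced_frames]
    return [sum(col) / len(col) for col in zip(*per_frame)]


# Synthetic stand-in for voiced frames (assumed 16 kHz, 32 ms -> 512 samples).
rng = random.Random(0)
fs = 16000
n = fs * 32 // 1000
frames = [[math.sin(2 * math.pi * 150 * t / fs)
           + 0.5 * math.sin(2 * math.pi * 300 * t / fs)
           + 0.05 * rng.gauss(0.0, 1.0) for t in range(n)]
          for _ in range(3)]
feats = utterance_features(frames)
print(len(feats), "averaged moment features")
```

In a full system, the average pitch over the voiced frames would be appended to this vector, and the resulting features would then be mapped to the subjective quality scores.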