{"title":"Dysfluent speech detection by image forensics techniques","authors":"Juraj Pálfy, Sakhia Darjaa, Jiri Pospíchal","doi":"10.1109/ASRU.2013.6707712","DOIUrl":null,"url":null,"abstract":"As speech recognition has become popular, the importance of dysfluency detection increased considerably. Once a dysfluent event in spontaneous speech is identified, the speech recognition performance could be enhanced by eliminating its negative effect. Most existing techniques to detect such dysfluent events are based on statistical models. Sparse regularity of dysfluent events and complexity to describe such events in a speech recognition system makes its recognition rigorous. These problems are addressed by our algorithm inspired by image forensics. This paper suggests our algorithm developed to extract novel features of complex dysfluencies. The common steps of classifier design were used to statistically evaluate the proposed features of complex dysfluencies in spectral and cepstral domains. Support vector machines perform objective assessment of MFCC features, MFCC based derived features, PCA based derived features and kernel PCA based derived features of complex dysfluencies, where our derived features increased the performance by 46% opposite to MFCC.","PeriodicalId":265258,"journal":{"name":"2013 IEEE Workshop on Automatic Speech Recognition and Understanding","volume":"23 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2013 IEEE Workshop on Automatic Speech Recognition and Understanding","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ASRU.2013.6707712","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1
Abstract
As speech recognition has become popular, the importance of dysfluency detection increased considerably. Once a dysfluent event in spontaneous speech is identified, the speech recognition performance could be enhanced by eliminating its negative effect. Most existing techniques to detect such dysfluent events are based on statistical models. Sparse regularity of dysfluent events and complexity to describe such events in a speech recognition system makes its recognition rigorous. These problems are addressed by our algorithm inspired by image forensics. This paper suggests our algorithm developed to extract novel features of complex dysfluencies. The common steps of classifier design were used to statistically evaluate the proposed features of complex dysfluencies in spectral and cepstral domains. Support vector machines perform objective assessment of MFCC features, MFCC based derived features, PCA based derived features and kernel PCA based derived features of complex dysfluencies, where our derived features increased the performance by 46% opposite to MFCC.