Automatic Severity Evaluation of Articulation Disorder in Speech using Dynamic Time Warping
Devi Krishna B, L. Mary, A. George
2021 Fourth International Conference on Microelectronics, Signals & Systems (ICMSS), 2021-11-18
DOI: 10.1109/ICMSS53060.2021.9673654
Abstract
Articulation disorder is a speech disorder in which a person has difficulty uttering certain sounds correctly. In children, early assessment and medical treatment of this disorder are especially important. An Automatic Speech Recognition (ASR) based system for assessing articulation disorder has already been developed by the Centre for Advanced Signal Processing (CASP), Department of Electronics, RIT Kottayam, in collaboration with the All India Institute of Speech and Hearing (AIISH), Mysore. Although that system successfully classifies disordered speech into mild, moderate, and severe categories, the numerical measure it produces for severe cases is not satisfactory. This paper proposes additional methods for calculating a numerical measure of articulation disorder. The first computes a similarity measure between articulation-disordered words and their corresponding normal words by Dynamic Time Warping (DTW) of the corresponding spectrogram images. The second computes the Log Cepstral Distance (LCD) between disordered and normal speech. Both are evaluated on the disordered speech in the collected dataset. The spectrogram-based measure is found to give better results and can therefore be combined with the existing ASR-based system for a better objective evaluation of articulation disorder.
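The DTW comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not give the authors' feature extraction, local cost, or normalisation, so everything below is an assumption made for illustration: the function names `log_spectrogram` and `dtw_distance` are hypothetical, the frame length and hop are arbitrary, the local cost is the Euclidean distance between log-magnitude spectrogram frames, and the accumulated cost is divided by the summed sequence lengths.

```python
import numpy as np


def log_spectrogram(signal, frame_len=256, hop=128):
    """Log-magnitude spectrogram, shape (n_frames, frame_len // 2 + 1).

    frame_len and hop are illustrative defaults, not values from the paper.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    # Small constant avoids log(0) on silent frames.
    return np.log(np.abs(np.fft.rfft(frames, axis=1)) + 1e-8)


def dtw_distance(a, b):
    """Length-normalised DTW cost between two (frames, features) arrays."""
    n, m = len(a), len(b)
    # Euclidean distance between every frame of a and every frame of b
    # (assumed local cost; the paper may use a different one).
    local = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    # acc[i, j] = cheapest cost of aligning a[:i] with b[:j].
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_prev = min(acc[i - 1, j],      # advance in a only
                            acc[i, j - 1],      # advance in b only
                            acc[i - 1, j - 1])  # advance in both
            acc[i, j] = local[i - 1, j - 1] + best_prev
    # Normalise so long and short words are comparable (assumed choice).
    return acc[n, m] / (n + m)
```

Under these assumptions, a disordered utterance and a normal reference of the same word would each be passed through `log_spectrogram` and the two results given to `dtw_distance`; a lower value means the utterance is closer to the reference. How such a value maps to a severity score is not stated in the abstract.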