Severity Based Adaptation for ASR to Aid Dysarthric Speakers
B. Al-Qatab, Mumtaz Begum Mustafa, S. Salim
2014 8th Asia Modelling Symposium, 2014-09-23. DOI: 10.1109/AMS.2014.40
Automatic speech recognition (ASR) for dysarthric speakers is one of the most challenging research areas, and the scarcity of dysarthric speech corpora makes it even more difficult. This paper introduces Intra-Severity adaptation, which uses a small amount of speech data: data from all participants in a given severity type are used to adapt the model for that type. The adaptation is performed for two types of acoustic models: the Controlled Acoustic Model (CAM), developed using a phonetically rich corpus, and the Dysarthric Acoustic Model (DAM), which includes speech collected from dysarthric speakers with varying levels of severity. The paper compares two adaptation techniques for building ASR systems for dysarthric speakers: Maximum Likelihood Linear Regression (MLLR) and Constrained Maximum Likelihood Linear Regression (CMLLR). The results show that the CAM outperformed the DAM in Word Recognition Accuracy (WRA) for both the Speaker Independent (SI) and Speaker Adaptation (SA) settings. In addition, MLLR outperformed CMLLR for both Controlled Speaker Adaptation (CSA) and Dysarthric Speaker Adaptation (DSA).
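To make the bookkeeping behind the abstract concrete, the following is a minimal Python sketch of two steps it implies: pooling adaptation utterances by severity type regardless of speaker, and scoring recognizer output. It is not the authors' implementation. The function names, the record fields (`speaker`, `severity`, `wav`), the file paths, and the example sentences are all hypothetical, and the abstract does not define WRA, so the sketch assumes the common definition 100 * (N - S - D - I) / N over reference words. The MLLR or CMLLR transform estimation itself is left to an ASR toolkit and is not shown.

```python
from collections import defaultdict


def pool_by_severity(utterances):
    """Group adaptation utterances by severity label, ignoring speaker identity.

    Each pool would then feed one adaptation run (e.g. MLLR) for that severity.
    """
    pools = defaultdict(list)
    for utt in utterances:
        pools[utt["severity"]].append(utt)
    return dict(pools)


def word_errors(reference, hypothesis):
    """Minimum substitutions + deletions + insertions (word-level edit distance)."""
    ref, hyp = reference.split(), hypothesis.split()
    prev = list(range(len(hyp) + 1))  # distance from an empty reference
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # distance to an empty hypothesis: i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_recognition_accuracy(pairs):
    """Assumed WRA in percent over (reference, hypothesis) pairs."""
    total_words = sum(len(ref.split()) for ref, _ in pairs)
    if total_words == 0:
        raise ValueError("no reference words to score")
    total_errors = sum(word_errors(ref, hyp) for ref, hyp in pairs)
    return 100.0 * (total_words - total_errors) / total_words


if __name__ == "__main__":
    # Hypothetical adaptation set: two mild speakers and one severe speaker.
    adaptation_set = [
        {"speaker": "spk1", "severity": "mild", "wav": "spk1_001.wav"},
        {"speaker": "spk2", "severity": "mild", "wav": "spk2_001.wav"},
        {"speaker": "spk3", "severity": "severe", "wav": "spk3_001.wav"},
    ]
    for severity, pool in pool_by_severity(adaptation_set).items():
        speakers = sorted({utt["speaker"] for utt in pool})
        print(severity, "pool:", len(pool), "utterances from", speakers)

    # Hypothetical recognizer output for one severity type.
    results = [("open the door", "open the door"),
               ("close the window", "close window")]
    print("WRA: %.1f%%" % word_recognition_accuracy(results))
```

Pooling across speakers is what lets a small amount of data per participant still yield a usable adaptation set for each severity type. Because insertions count as errors under the assumed definition, the score can become negative for very poor hypotheses.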