Improving Automatic Evaluation of Mandarin Pronunciation with Speaker Adaptive Training (SAT) and MLLR Speaker Adaption

Chao Huang, Feng Zhang, F. Soong
2008 6th International Symposium on Chinese Spoken Language Processing. Published 2008-12-30. DOI: 10.1109/CHINSL.2008.ECP.21 (https://doi.org/10.1109/CHINSL.2008.ECP.21)
Citations: 8

Abstract

Automatic pronunciation evaluation (APE) can be implemented with a speech recognition model trained on speech from standard, "golden" speakers. Pronunciation accuracy is then measured with the Goodness of Pronunciation (GOP) score, as reported in our earlier work [1]. In this paper, we investigate two strategies for improving the evaluation: speaker adaptive training (SAT), which reduces speaker-specific characteristics during model training, and MLLR-based speaker adaptation at evaluation time, which reduces the mismatch between the trained model and a test speaker. Overall, the proposed strategies improve the correlation between APE and human expert evaluations from 0.69 to 0.76, approaching the upper bound of 0.78 observed among human expert evaluators. APE also shows a consistency of 0.93, higher than the consistency of 0.83 among human experts.
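As a rough illustration of the two quantities the abstract relies on, the sketch below computes a duration-normalized GOP score and the Pearson correlation between machine and human scores. The abstract does not state the exact GOP formula, so this assumes a formulation common in the literature: the log-likelihood of the forced-aligned phone minus that of the best competing phone, divided by the number of frames. The signed value is kept here so that higher means better; some formulations take the absolute value instead. The function names `gop` and `pearson` and all numbers are hypothetical and are not taken from the paper.

```python
import math


def gop(forced_loglik, competing_logliks, num_frames):
    """Duration-normalized GOP for one phone segment (assumed formulation).

    forced_loglik: log-likelihood of the segment under the canonical phone.
    competing_logliks: log-likelihoods under every candidate phone
        (normally including the canonical one, so the result is <= 0).
    num_frames: number of acoustic frames in the segment.
    """
    best = max(competing_logliks)
    return (forced_loglik - best) / num_frames


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Toy data only: per-speaker machine scores versus human expert ratings.
machine_scores = [-0.2, -1.5, -0.7, -3.0]
human_scores = [4.5, 3.0, 4.0, 1.5]
correlation = pearson(machine_scores, human_scores)
```

In a setup like the one described, the likelihoods fed to `gop` would come from the acoustic model after SAT and MLLR adaptation to the test speaker, and `correlation` would be the figure compared against the agreement among human raters.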