A New Spoken Language Teaching Tech: Combining Multi-attention and AdaIN for One-shot Cross Language Voice Conversion

Dengfeng Ke, Wenhan Yao, Ruixin Hu, Liangjie Huang, Qi Luo, Wentao Shu
Published in: 2022 13th International Symposium on Chinese Spoken Language Processing (ISCSLP)
Publication date: 2022-12-11
DOI: 10.1109/ISCSLP57327.2022.10038137
URL: https://doi.org/10.1109/ISCSLP57327.2022.10038137
Citations: 0

Abstract

Computer-aided pronunciation training (CAPT) plays an important role in oral language teaching. Traditional computer-assisted oral teaching relies mainly on mispronunciation detection and on pronunciation scoring and assessment. However, these two techniques give only negative feedback, such as scores or error categories, and it is difficult for learners to refine their pronunciation from these indicators alone, without the guidance of correct speech. To tackle this problem, we propose a cross-language voice conversion (VC) framework that generates speech with the content of the template speech and the learner's own timbre, which can guide the learner's pronunciation. To improve the conversion quality, we apply AdaIN twice, once at the front end and once after the Value matrix in multi-head attention. We call this attention-AdaIN; it can improve both style transfer and sequence generation. We use attention-AdaIN to build a VC framework based on a VAE. Experiments on the AISHELL-3 and VCTK corpora show that the new approach improves on the baseline VAE-VC.
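
The abstract does not spell out how AdaIN is placed inside the attention layer, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It uses the standard adaptive instance normalization (normalize each channel of the content features over time, then rescale to the style features' per-channel mean and standard deviation) and applies it once on the layer input and once on the Value projection. The function names `adain` and `attention_adain`, the single-head layout, the choice to project the style features through the same Value matrix, and all shapes are assumptions made for illustration.

```python
import numpy as np


def adain(content, style, eps=1e-5):
    """Adaptive instance normalization along the time axis.

    content: (T_c, C) array, style: (T_s, C) array.
    Each channel of `content` is normalized over time, then scaled and
    shifted to match the per-channel statistics of `style`.
    """
    c_mean = content.mean(axis=0, keepdims=True)
    c_std = content.std(axis=0, keepdims=True) + eps
    s_mean = style.mean(axis=0, keepdims=True)
    s_std = style.std(axis=0, keepdims=True) + eps
    return s_std * (content - c_mean) / c_std + s_mean


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention_adain(x, style, w_q, w_k, w_v):
    """Single-head self-attention with AdaIN on the input and on V.

    Hypothetical layout: the placement of the two AdaIN calls is an
    assumption based on the abstract's wording.
    """
    x = adain(x, style)                     # AdaIN at the front end
    q, k = x @ w_q, x @ w_k
    # AdaIN after the Value projection; projecting the style features
    # with the same w_v (so channel counts match) is an assumption.
    v = adain(x @ w_v, style @ w_v)
    weights = softmax(q @ k.T / np.sqrt(q.shape[-1]))
    return weights @ v


rng = np.random.default_rng(0)
content = rng.normal(size=(50, 8))                    # template-speech features
style = rng.normal(loc=2.0, scale=3.0, size=(30, 8))  # learner-speech features
w_q, w_k, w_v = (rng.normal(size=(8, 8)) for _ in range(3))
out = attention_adain(content, style, w_q, w_k, w_v)
print(out.shape)  # (50, 8)
```

The output keeps the content sequence length (50 frames), while the per-channel statistics injected at both points come from the style input. A multi-head version would apply the same steps per head.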