A Linguistic-based Transfer Learning Approach for Low-resource Bahnar Text-to-Speech

T. Nguyen, Quang Tuong Lam, D. Do, Huu Thuc Cai, Hoang Suong Nguyen, Thanh Hung Vo, Duc Dung Nguyen
Published in: 2022 9th NAFOSTED Conference on Information and Computer Science (NICS), 2022-10-31. DOI: 10.1109/NICS56915.2022.10013451 (https://doi.org/10.1109/NICS56915.2022.10013451)

Abstract

Text-to-Speech (TTS) models typically require a large number of utterances recorded at standard quality to synthesize high-fidelity speech. For low-resource languages, the lack of such data is a major challenge. In this work, we address this problem for Bahnar Kriem, a rare language spoken by the Bahnar people living in Binh Dinh province, Vietnam. We propose a linguistic approach, together with several preprocessing steps, for processing a poor-quality dataset of 720 Bahnar Kriem utterances. Our analysis of Bahnar Kriem also reveals a mixture of Bahnar and Vietnamese that results from the shared historical development of the two peoples. We therefore propose a transfer learning approach that integrates Vietnamese pronunciation into the Bahnar TTS synthesizer. Experiments show a significant improvement in TTS performance for this low-resource language. Our model can also generate long Bahnar sentences with a short inference time. Subjective and objective evaluations suggest promising results and point to potential improvements to our approach. We also provide audio samples generated by our model.