Zero-Shot Foreign Accent Conversion without a Native Reference

Waris Quamer, Anurag Das, John M. Levis, E. Chukharev-Hudilainen, R. Gutierrez-Osuna
Interspeech, pp. 4920-4924. Published 2022-09-18.
DOI: 10.21437/interspeech.2022-10664 (https://doi.org/10.21437/interspeech.2022-10664)
Citations: 5

Abstract

Previous approaches to foreign accent conversion (FAC) either need a reference utterance from a native (L1) speaker during synthesis, or are dedicated one-to-one systems that must be trained separately for each non-native (L2) speaker. To address both issues, we propose a new FAC system that can transform L2 speech directly from previously unseen speakers. The system consists of two independent modules, a translator and a synthesizer, which operate on bottleneck features derived from phonetic posteriorgrams. The translator is trained to map bottleneck features from L2 utterances into those from a parallel L1 utterance. The synthesizer is a many-to-many system that maps input bottleneck features into the corresponding Mel-spectrograms, conditioned on an embedding from the L2 speaker. During inference, both modules operate in sequence to take an unseen L2 utterance and generate a native-accented Mel-spectrogram. Perceptual experiments show that our system achieves a large reduction (67%) in non-native accentedness, compared with 28.9% for a state-of-the-art reference-free system that builds a dedicated model for each L2 speaker. Moreover, 80% of the listeners rated the synthesized utterances as having the same voice identity as the L2 speaker.
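
The two-stage inference flow described in the abstract can be sketched roughly as follows. This is only an illustrative outline, not the authors' implementation: the function names (translate, synthesize, convert_accent), the feature dimensions, and the random placeholder weights are all assumptions, and the sketch keeps the frame count unchanged, which the real translator may not do.

```python
import numpy as np

# All sizes below are assumptions for illustration; the abstract does not state them.
BNF_DIM = 256   # bottleneck feature size
MEL_DIM = 80    # number of Mel bins
SPK_DIM = 64    # speaker-embedding size

rng = np.random.default_rng(0)

# Random placeholder weights standing in for the two trained modules.
W_translate = rng.standard_normal((BNF_DIM, BNF_DIM)) * 0.01
W_synth = rng.standard_normal((BNF_DIM + SPK_DIM, MEL_DIM)) * 0.01


def translate(l2_bnf):
    """Hypothetical translator: map L2 bottleneck features (frames, BNF_DIM)
    to native-like bottleneck features of the same shape."""
    return np.tanh(l2_bnf @ W_translate)


def synthesize(bnf, speaker_embedding):
    """Hypothetical synthesizer: map bottleneck features to a Mel-spectrogram,
    conditioned on a fixed speaker embedding repeated for every frame."""
    frames = bnf.shape[0]
    cond = np.tile(speaker_embedding, (frames, 1))
    return np.concatenate([bnf, cond], axis=1) @ W_synth


def convert_accent(l2_bnf, speaker_embedding):
    """Run both modules in sequence, as the abstract describes for inference."""
    return synthesize(translate(l2_bnf), speaker_embedding)


# Example: 120 frames of features from an unseen L2 speaker.
l2_bnf = rng.standard_normal((120, BNF_DIM))
l2_speaker = rng.standard_normal(SPK_DIM)
mel = convert_accent(l2_bnf, l2_speaker)
```

Because the speaker embedding enters only at the synthesizer, the translator in this sketch is speaker-independent, which mirrors how the abstract separates accent conversion from voice identity.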