MoEVC: A Mixture of Experts Voice Conversion System With Sparse Gating Mechanism for Online Computation Acceleration

Yu-Tao Chang, Yuan-Hong Yang, Yu-Huai Peng, Syu-Siang Wang, T. Chi, Yu Tsao, Hsin-Min Wang
Published in: 2021 12th International Symposium on Chinese Spoken Language Processing (ISCSLP)
Publication date: 2021-01-24
DOI: 10.1109/ISCSLP49672.2021.9362072
URL: https://doi.org/10.1109/ISCSLP49672.2021.9362072
Citations: 1

Abstract

Owing to the recent advancements in deep learning technology, the performance of voice conversion (VC) in terms of quality and similarity has significantly improved. However, complex computation is generally required for deep-learning-based VC systems. This can cause a notable latency, which limits the deployment of such VC systems in real-world applications. Therefore, increasing the efficiency of online computing has become an important task. In this study, we propose a novel mixture-of-experts (MoE) based VC system, termed MoEVC. The MoEVC system uses a gating mechanism to assign weights to feature maps to increase VC performance. In addition, applying sparse constraints on the gating mechanism can skip some convolution processes through elimination of redundant feature maps, thereby accelerating online computing. Experimental results show that by using proper sparse constraints, we can effectively reduce the FLOPs (floating-point operations) count by 70%, while improving VC performance in both objective evaluation and human subjective listening tests.
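
The idea of gating feature maps and skipping the convolutions whose gates are zero can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the MoEVC implementation: the abstract does not specify how the sparse constraint is applied, so the thresholded softmax in `sparse_gate` is a stand-in, and the names `conv1d`, `sparse_gate`, `moe_layer`, and the `threshold` parameter are hypothetical. The FLOPs figure counts only the convolutions, approximated as one multiply and one add per kernel tap.

```python
import math


def conv1d(signal, kernel):
    """Valid-mode 1-D cross-correlation of a list with a kernel."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def sparse_gate(logits, threshold=0.1):
    """Softmax over expert logits, then zero every gate below `threshold`.

    Assumption: hard thresholding stands in for the paper's sparse
    constraint, whose exact form the abstract does not give.
    """
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]  # shift for numerical stability
    total = sum(exps)
    return [e / total if e / total >= threshold else 0.0 for e in exps]


def moe_layer(signal, kernels, gates):
    """Gate-weighted sum of per-expert feature maps.

    Experts whose gate is exactly zero are skipped, so their convolution
    is never computed. Returns (output, approximate convolution FLOPs).
    """
    out_len = len(signal) - len(kernels[0]) + 1
    output = [0.0] * out_len
    flops = 0
    for kernel, gate in zip(kernels, gates):
        if gate == 0.0:
            continue  # redundant feature map: skip this convolution
        feature_map = conv1d(signal, kernel)
        flops += 2 * len(kernel) * out_len  # approx. multiply + add per tap
        for i, value in enumerate(feature_map):
            output[i] += gate * value
    return output, flops


# Usage: three experts, one of which receives a low gate and is pruned.
signal = [1.0, 2.0, 3.0, 4.0]
kernels = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
gates = sparse_gate([2.0, 0.0, 2.0], threshold=0.1)
output, flops = moe_layer(signal, kernels, gates)
dense_flops = 2 * len(kernels[0]) * (len(signal) - 1) * len(kernels)
print(gates, output, flops, dense_flops)
```

In this toy setting, pruning one of three experts removes one third of the convolution cost. The saving grows with the fraction of gates driven to zero, which is the effect the abstract attributes to the sparse constraint.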