Deep Self-Supervised Learning of Speech Denoising from Noisy Speeches

Y. Sanada, Takumi Nakagawa, Yuichiro Wada, K. Takanashi, Yuhui Zhang, Kiichi Tokuyama, T. Kanamori, Tomonori Yamada
{"title":"噪声语音去噪的深度自监督学习","authors":"Y. Sanada, Takumi Nakagawa, Yuichiro Wada, K. Takanashi, Yuhui Zhang, Kiichi Tokuyama, T. Kanamori, Tomonori Yamada","doi":"10.21437/interspeech.2022-306","DOIUrl":null,"url":null,"abstract":"In the last few years, unsupervised learning methods have been proposed in speech denoising by taking advantage of Deep Neural Networks (DNNs). The reason is that such unsupervised methods are more practical than the supervised counterparts. In our scenario, we are given a set of noisy speech data, where any two data do not share the same clean data. Our goal is to obtain the denoiser by training a DNN based model. Using the set, we train the model via the following two steps: 1) From the noisy speech data, construct another noisy speech data via our proposed masking technique. 2) Minimize our proposed loss defined from the DNN and the two noisy speech data. We evaluate our method using Gaussian and real-world noises in our numerical experiments. As a result, our method outperforms the state-of-the-art method on average for both noises. In addi-tion, we provide the theoretical explanation of why our method can be efficient if the noise has Gaussian distribution.","PeriodicalId":73500,"journal":{"name":"Interspeech","volume":"1 1","pages":"1178-1182"},"PeriodicalIF":0.0000,"publicationDate":"2022-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Deep Self-Supervised Learning of Speech Denoising from Noisy Speeches\",\"authors\":\"Y. Sanada, Takumi Nakagawa, Yuichiro Wada, K. Takanashi, Yuhui Zhang, Kiichi Tokuyama, T. Kanamori, Tomonori Yamada\",\"doi\":\"10.21437/interspeech.2022-306\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In the last few years, unsupervised learning methods have been proposed in speech denoising by taking advantage of Deep Neural Networks (DNNs). The reason is that such unsupervised methods are more practical than the supervised counterparts. In our scenario, we are given a set of noisy speech data, where any two data do not share the same clean data. Our goal is to obtain the denoiser by training a DNN based model. Using the set, we train the model via the following two steps: 1) From the noisy speech data, construct another noisy speech data via our proposed masking technique. 2) Minimize our proposed loss defined from the DNN and the two noisy speech data. We evaluate our method using Gaussian and real-world noises in our numerical experiments. As a result, our method outperforms the state-of-the-art method on average for both noises. 
In addi-tion, we provide the theoretical explanation of why our method can be efficient if the noise has Gaussian distribution.\",\"PeriodicalId\":73500,\"journal\":{\"name\":\"Interspeech\",\"volume\":\"1 1\",\"pages\":\"1178-1182\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-09-18\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Interspeech\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.21437/interspeech.2022-306\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Interspeech","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.21437/interspeech.2022-306","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 1

Abstract

In the last few years, unsupervised learning methods that take advantage of Deep Neural Networks (DNNs) have been proposed for speech denoising, since such unsupervised methods are more practical than their supervised counterparts. In our scenario, we are given a set of noisy speech data in which no two samples share the same underlying clean speech. Our goal is to obtain a denoiser by training a DNN-based model. Using this set, we train the model in two steps: 1) from each noisy speech sample, construct a second noisy sample via our proposed masking technique; 2) minimize our proposed loss, defined from the DNN and the two noisy samples. We evaluate our method on Gaussian and real-world noise in numerical experiments; on average, it outperforms the state-of-the-art method for both noise types. In addition, we provide a theoretical explanation of why our method can be efficient when the noise is Gaussian.
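The abstract does not spell out the masking technique or the loss, so the following is only a minimal PyTorch sketch of the general idea it describes: build a second noisy view of each sample by masking, then train the denoiser against the original noisy observation. All names here (TinyDenoiser, random_mask, mask_ratio) and the blind-spot loss restriction are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch (not the paper's code): self-supervised denoising
# from noisy speech only, via a masked second view of each sample.
import torch
import torch.nn as nn

class TinyDenoiser(nn.Module):
    """Toy 1-D convolutional denoiser operating on raw waveforms."""
    def __init__(self, channels: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, channels, kernel_size=9, padding=4),
            nn.ReLU(),
            nn.Conv1d(channels, channels, kernel_size=9, padding=4),
            nn.ReLU(),
            nn.Conv1d(channels, 1, kernel_size=9, padding=4),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

def random_mask(x: torch.Tensor, mask_ratio: float = 0.2):
    """Zero out a random fraction of samples; return the masked view and mask."""
    mask = (torch.rand_like(x) > mask_ratio).float()  # 1 = kept, 0 = masked
    return x * mask, mask

model = TinyDenoiser()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
noisy = torch.randn(8, 1, 16000)  # stand-in batch of noisy waveforms

for step in range(100):
    masked, mask = random_mask(noisy)  # step 1: construct second noisy view
    pred = model(masked)
    # Step 2: regress toward the original noisy data, but only at the
    # masked-out positions, so the model cannot copy its input through.
    masked_out = 1.0 - mask
    loss = (masked_out * (pred - noisy)).pow(2).sum() / masked_out.sum().clamp(min=1.0)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Restricting the loss to masked-out positions is the standard blind-spot trick from masking-based self-supervised denoising; for zero-mean noise, the noisy observation is an unbiased target for the clean signal, which is consistent with the abstract's claim about Gaussian noise. Whether the paper uses this exact restriction is not stated in the abstract.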