Deep Attractor with Convolutional Network for Monaural Speech Separation

Tian Lan, Yuxin Qian, Wenxin Tai, Boce Chu, Qiao Liu
Published in: 2020 11th IEEE Annual Ubiquitous Computing, Electronics & Mobile Communication Conference (UEMCON)
Publication date: 2020-10-28
DOI: 10.1109/UEMCON51285.2020.9298070 (https://doi.org/10.1109/UEMCON51285.2020.9298070)
Citations: 0

Abstract

The deep attractor network (DANet) is a recent deep learning-based method for monaural speech separation. The idea is to map the time-frequency bins of the spectrogram into an embedding space and to form an attractor for each source, from which the masks are estimated. The original deep attractor network uses the true speaker assignments to form attractors during training, but during the test phase it estimates attractors with either the K-means algorithm or a fixed attractor method. The fixed attractor method does not perform well when training and test conditions differ. Using the K-means algorithm at test time raises a center mismatch problem, which leads to performance degradation. In this letter, we propose to use convolutional networks to estimate attractors in both the training and test phases. Because the same method generates the attractors in both phases, the center mismatch problem is solved. Results show that, with limited training data, the proposed method outperforms DANet using the K-means method and achieves performance comparable to DANet using the ideal binary mask at test time.
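
To make the attractor idea concrete, the following is a minimal NumPy sketch of the baseline DANet computation that the abstract describes: attractors formed from known assignments (training), masks estimated from embedding-attractor similarity, and K-means used as a stand-in attractor estimate at test time. It is an illustration under stated assumptions, not the authors' implementation. The function names, array shapes, the softmax mask, and the plain Lloyd's K-means are all assumptions, and the proposed convolutional attractor estimator is not shown because the abstract does not specify its architecture.

```python
import numpy as np


def form_attractors(embeddings, assignments):
    """Attractors as assignment-weighted centroids of the embeddings.

    embeddings:  (TF, K) array, one K-dim embedding per time-frequency bin
    assignments: (TF, C) array, one-hot (or soft) source membership per bin
    returns:     (C, K) array, one attractor per source
    """
    weights = assignments.sum(axis=0)[:, None]  # (C, 1) total weight per source
    return (assignments.T @ embeddings) / np.maximum(weights, 1e-8)


def estimate_masks(embeddings, attractors):
    """Softmax over sources of embedding-attractor similarity, shape (TF, C).

    The softmax is an assumption of this sketch; other normalizations are possible.
    """
    logits = embeddings @ attractors.T           # (TF, C) dot-product similarity
    logits -= logits.max(axis=1, keepdims=True)  # subtract row max for stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)


def kmeans_attractors(embeddings, num_sources, iters=20, seed=0):
    """Plain Lloyd's K-means as a test-time attractor estimate (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    picks = rng.choice(len(embeddings), num_sources, replace=False)
    centers = embeddings[picks]                  # fancy indexing returns a copy
    for _ in range(iters):
        dists = ((embeddings[:, None, :] - centers[None]) ** 2).sum(-1)
        labels = dists.argmin(axis=1)
        for c in range(num_sources):
            if np.any(labels == c):              # keep old center if cluster is empty
                centers[c] = embeddings[labels == c].mean(axis=0)
    return centers
```

In this sketch the training-time attractors come from form_attractors and the test-time ones from kmeans_attractors. Nothing forces the two sets of centers to coincide, which is one way to picture the center mismatch problem the abstract refers to.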