DFBNet: Deep Neural Network based Fixed Beamformer for Multi-channel Speech Separation
Ruqiao Liu, Yi Zhou, Hongqing Liu, Xinmeng Xu, Jie Jia, Binbin Chen
2021 IEEE Workshop on Signal Processing Systems (SiPS), published 2021-10-01
DOI: 10.1109/SiPS52927.2021.00042
https://doi.org/10.1109/SiPS52927.2021.00042
Abstract
Deep neural network (DNN) based beamformers have achieved significant improvements in speech separation tasks. This paper proposes a novel DNN-based fixed beamformer (DFBNet) that uniformly samples the space as a learning module. In addition, the initial coefficients of the fixed beamformers in DFBNet are determined by the existing superdirective beamformer. Furthermore, to obtain the beams related to each speaker, the proposed model introduces a speech source estimation model, dual-path RNN (DPRNN), and an attention mechanism. Experimental results show that, in the separation task with reverberation, the proposed method outperforms DPRNN and the filter-and-sum network (FasNet), currently the state-of-the-art temporal neural beamformer, on scale-invariant signal-to-noise ratio (SI-SNR) and perceptual evaluation of speech quality (PESQ).
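For readers unfamiliar with the SI-SNR metric named in the abstract, the sketch below follows the commonly used definition: both signals are mean-centered, the estimate is projected onto the reference to obtain the target component, and the ratio of target energy to residual energy is reported in dB. This is an illustration only, not the authors' evaluation code; the function name `si_snr` and the stabilizing constant `eps` are assumptions introduced here.

```python
import math


def si_snr(estimate, reference, eps=1e-8):
    """Scale-invariant SNR in dB between two equal-length 1-D signals.

    Illustrative sketch of the common definition; `eps` is an assumed
    small constant that avoids division by zero and log of zero.
    """
    if len(estimate) != len(reference):
        raise ValueError("signals must have the same length")
    n = len(reference)

    # Remove the mean so a DC offset does not affect the score.
    est_mean = sum(estimate) / n
    ref_mean = sum(reference) / n
    est = [x - est_mean for x in estimate]
    ref = [x - ref_mean for x in reference]

    # Project the estimate onto the reference: this is the target component.
    dot = sum(e * r for e, r in zip(est, ref))
    ref_energy = sum(r * r for r in ref) + eps
    scale = dot / ref_energy
    target = [scale * r for r in ref]

    # Everything in the estimate that is not the target counts as noise.
    noise = [e - t for e, t in zip(est, target)]
    target_energy = sum(t * t for t in target)
    noise_energy = sum(v * v for v in noise) + eps

    return 10.0 * math.log10(target_energy / noise_energy + eps)
```

Because the estimate is projected onto the reference before the ratio is taken, multiplying the estimate by any nonzero gain leaves the score essentially unchanged, which is what "scale-invariant" refers to.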