Blind speech separation exploiting temporal and spectral correlations using 2D-HMMs

21st European Signal Processing Conference (EUSIPCO 2013). Pub Date: 2013-09-09. DOI: 10.5281/ZENODO.43429

Dang Hai Tran Vu, Reinhold Häb-Umbach

Citations: 5

Abstract

We present a novel method for exploiting correlations between adjacent time-frequency (TF) slots in a sparseness-based blind speech separation (BSS) system. Usually, these correlations are exploited by heuristic smoothing techniques applied in post-processing to the estimated soft TF masks. We propose a different approach: building on our previous work with one-dimensional (1D) hidden Markov models (HMMs) along the time axis, we extend the modeling to two-dimensional (2D) HMMs to exploit both temporal and spectral correlations in the speech signal. Based on the principles of turbo decoding, we solve the complex inference problem of 2D-HMMs with a modified forward-backward algorithm that operates alternately along the time and frequency axes. Extrinsic information is exchanged between these steps so that increasingly better soft TF masks are obtained, leading to improved speech separation performance in highly reverberant recording conditions.
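The alternating inference described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`forward_backward`, `extrinsic`, `turbo_masks`), the array layout `(T, F, K)`, the shared initial distribution `prior`, and the specific rule for forming extrinsic information (dividing the posterior by the evidence that was fed into the pass) are all assumptions chosen for illustration. Each TF slot is assumed to carry a discrete hidden state naming the dominant source, with per-slot likelihoods `lik` given.

```python
import numpy as np

def forward_backward(evidence, trans, prior):
    """Scaled forward-backward over a batch of independent 1D chains.

    evidence: (B, N, K) non-negative local evidence (chain, position, state).
    trans:    (K, K) row-stochastic matrix, trans[i, j] = P(next=j | current=i).
    prior:    (K,) initial state distribution.
    Returns the state posteriors, shape (B, N, K).
    """
    B, N, K = evidence.shape
    alpha = np.empty((B, N, K))
    beta = np.ones((B, N, K))
    a = prior * evidence[:, 0]
    alpha[:, 0] = a / a.sum(axis=1, keepdims=True)
    for n in range(1, N):
        a = (alpha[:, n - 1] @ trans) * evidence[:, n]
        alpha[:, n] = a / a.sum(axis=1, keepdims=True)
    for n in range(N - 2, -1, -1):
        b = (beta[:, n + 1] * evidence[:, n + 1]) @ trans.T
        beta[:, n] = b / b.sum(axis=1, keepdims=True)
    gamma = alpha * beta
    return gamma / gamma.sum(axis=2, keepdims=True)

def extrinsic(gamma, evidence, eps=1e-12):
    """Assumed extrinsic rule: remove the evidence a pass was given, so only
    what the chain neighbours contributed is handed to the other pass."""
    ext = np.maximum(gamma / np.maximum(evidence, eps), eps)
    return ext / ext.sum(axis=-1, keepdims=True)

def turbo_masks(lik, trans_t, trans_f, prior, n_iter=5):
    """Alternate 1D passes along time and frequency (n_iter >= 1).

    lik: (T, F, K) per-slot likelihoods. Returns soft masks, shape (T, F, K).
    """
    lik = np.maximum(lik, 1e-12)
    ext_f = np.full_like(lik, 1.0 / lik.shape[-1])  # no information yet
    for _ in range(n_iter):
        # Time pass: one chain per frequency bin, batch layout (F, T, K).
        ev_t = (lik * ext_f).transpose(1, 0, 2)
        gamma_t = forward_backward(ev_t, trans_t, prior)
        ext_t = extrinsic(gamma_t, ev_t).transpose(1, 0, 2)
        # Frequency pass: one chain per time frame, batch layout (T, F, K).
        ev_f = lik * ext_t
        gamma_f = forward_backward(ev_f, trans_f, prior)
        ext_f = extrinsic(gamma_f, ev_f)
    return gamma_f

# Usage with synthetic likelihoods for K = 2 sources (values are made up).
rng = np.random.default_rng(0)
lik = rng.uniform(0.1, 1.0, size=(6, 5, 2))
sticky = np.array([[0.9, 0.1], [0.1, 0.9]])
masks = turbo_masks(lik, sticky, sticky, prior=np.array([0.5, 0.5]))
```

With uniform transition matrices the chains carry no neighbour information, and under these assumptions the masks reduce to the per-slot normalized likelihoods, which makes a convenient sanity check.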
