M. Souden, S. Araki, K. Kinoshita, T. Nakatani, H. Sawada
{"title":"一种基于多通道mmse的联合盲源分离与降噪框架","authors":"M. Souden, S. Araki, K. Kinoshita, T. Nakatani, H. Sawada","doi":"10.1109/ICASSP.2012.6287829","DOIUrl":null,"url":null,"abstract":"In this paper, we propose a new framework to separate multiple speech signals and reduce the additive acoustic noise using multiple microphones. In this framework, we start by formulating the minimum-mean-square error (MMSE) criterion to retrieve each of the desired speech signals from the observed mixtures of sounds and outline the importance of multi-speaker activity detection. The latter is modeled by introducing a latent variable whose posterior probability is computed via expectation maximization (EM) combining both the spatial and spectral cues of the multichannel speech observations. We experimentally demonstrate that the resulting joint blind source separation (BSS) and noise reduction solution performs remarkably well in reverberant and noisy environments.","PeriodicalId":6443,"journal":{"name":"2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2012-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"10","resultStr":"{\"title\":\"A multichannel MMSE-based framework for joint blind source separation and noise reduction\",\"authors\":\"M. Souden, S. Araki, K. Kinoshita, T. Nakatani, H. Sawada\",\"doi\":\"10.1109/ICASSP.2012.6287829\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In this paper, we propose a new framework to separate multiple speech signals and reduce the additive acoustic noise using multiple microphones. In this framework, we start by formulating the minimum-mean-square error (MMSE) criterion to retrieve each of the desired speech signals from the observed mixtures of sounds and outline the importance of multi-speaker activity detection. The latter is modeled by introducing a latent variable whose posterior probability is computed via expectation maximization (EM) combining both the spatial and spectral cues of the multichannel speech observations. We experimentally demonstrate that the resulting joint blind source separation (BSS) and noise reduction solution performs remarkably well in reverberant and noisy environments.\",\"PeriodicalId\":6443,\"journal\":{\"name\":\"2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2012-03-25\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"10\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICASSP.2012.6287829\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICASSP.2012.6287829","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
A multichannel MMSE-based framework for joint blind source separation and noise reduction
In this paper, we propose a new framework to separate multiple speech signals and reduce the additive acoustic noise using multiple microphones. In this framework, we start by formulating the minimum-mean-square error (MMSE) criterion to retrieve each of the desired speech signals from the observed mixtures of sounds and outline the importance of multi-speaker activity detection. The latter is modeled by introducing a latent variable whose posterior probability is computed via expectation maximization (EM) combining both the spatial and spectral cues of the multichannel speech observations. We experimentally demonstrate that the resulting joint blind source separation (BSS) and noise reduction solution performs remarkably well in reverberant and noisy environments.