Marvin Coto-Jiménez, John Goddard Close, L. D. Persia, H. Rufiner
Title: Hybrid Speech Enhancement with Wiener filters and Deep LSTM Denoising Autoencoders
Published in: 2018 IEEE International Work Conference on Bioinspired Intelligence (IWOBI), 2018-07-01
DOI: https://doi.org/10.1109/IWOBI.2018.8464132
Citations: 14
Abstract
Over the past several decades, numerous speech enhancement techniques have been proposed to improve the performance of modern communication devices in noisy environments. These include a wide range of classical algorithms (e.g., spectral subtraction, Wiener filtering, and Bayesian-based enhancement) and, more recently, several approaches based on deep neural networks. In this paper, we propose a hybrid approach to speech enhancement that combines two stages. In the first stage, the well-known Wiener filter enhances the noisy speech. In the second stage, the output is refined using a new multi-stream approach, which involves a collection of denoising autoencoders and auto-associative memories based on Long Short-Term Memory (LSTM) networks. We carry out a comparative performance analysis with two objective measures, using artificial noise added at different signal-to-noise ratios. Results show that the hybrid system enhances the signal significantly better than either Wiener filtering or the LSTM networks alone.
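The noise-mixing step and the first (Wiener) stage mentioned in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not say which Wiener variant, framing, or noise estimator was used, so this sketch assumes stationary white noise of known power and applies a single FFT over the whole signal instead of short-time frames. The function names (`add_noise_at_snr`, `measure_snr_db`, `wiener_gain`, `wiener_enhance`) are hypothetical, and the LSTM refinement stage is not covered.

```python
import numpy as np


def add_noise_at_snr(clean, noise, target_snr_db):
    """Scale `noise` so the clean-to-noise power ratio equals `target_snr_db`, then mix."""
    clean = np.asarray(clean, dtype=float)
    noise = np.asarray(noise, dtype=float)[: len(clean)]
    p_clean = np.mean(clean ** 2)
    p_noise = np.mean(noise ** 2)
    # From SNR_dB = 10 * log10(p_clean / p_target): p_target = p_clean / 10^(SNR_dB / 10).
    scale = np.sqrt(p_clean / (p_noise * 10.0 ** (target_snr_db / 10.0)))
    scaled_noise = scale * noise
    return clean + scaled_noise, scaled_noise


def measure_snr_db(reference, error):
    """SNR in dB between a reference signal and an error (noise or residual) signal."""
    return 10.0 * np.log10(np.mean(reference ** 2) / np.mean(error ** 2))


def wiener_gain(noisy_psd, noise_psd, floor=1e-12):
    """Per-bin gain S / (S + N), with S estimated as max(noisy_psd - noise_psd, 0)."""
    noisy_psd = np.maximum(np.asarray(noisy_psd, dtype=float), floor)
    speech_psd = np.maximum(noisy_psd - noise_psd, 0.0)
    return speech_psd / noisy_psd


def wiener_enhance(noisy, noise_power):
    """Apply the gain in the frequency domain (assumption: white noise of known power)."""
    spectrum = np.fft.rfft(noisy)
    noisy_psd = np.abs(spectrum) ** 2
    # Expected squared magnitude of an unnormalized FFT bin for white noise.
    noise_psd = noise_power * len(noisy)
    gain = wiener_gain(noisy_psd, noise_psd)
    return np.fft.irfft(gain * spectrum, n=len(noisy))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.arange(16000) / 16000.0
    clean = np.sin(2.0 * np.pi * 440.0 * t)  # toy stand-in for a speech signal
    noisy, scaled_noise = add_noise_at_snr(clean, rng.standard_normal(len(t)), 0.0)
    enhanced = wiener_enhance(noisy, np.mean(scaled_noise ** 2))
    print("input SNR (dB): ", measure_snr_db(clean, scaled_noise))
    print("output SNR (dB):", measure_snr_db(clean, enhanced - clean))
```

In a hybrid setup like the one the abstract describes, the output of such a first stage would be passed to the learned second stage; a practical version would work on short overlapping frames and estimate the noise spectrum from the signal itself.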