Linear acoustic echo cancellation using deep neural networks and convex reconstruction of incomplete transfer function
Michael Müller, Jakub Janský, M. Bohac, Zbyněk Koldovský
2017 IEEE International Workshop of Electronics, Control, Measurement, Signals and their Application to Mechatronics (ECMSM), May 2017
DOI: 10.1109/ECMSM.2017.7945913
Citations: 3
Abstract
Estimating the linear acoustic path for acoustic echo cancellation is difficult while the near-end signal (speech) is active. We assume that the impulse response is sparse. Many algorithms estimate a sparse impulse response in the time domain; we instead propose algorithms that work in the time-frequency domain. Our approach assumes that the transfer function can be estimated only at frequencies where the near-end signal is not active. First, a deep neural network trained on mixed signals detects the activity of the near-end signal. At frequencies where no activity is detected, the acoustic transfer function is estimated by conventional frequency-domain least squares. This yields an incomplete transfer function (ITF) estimate. The ITF is then completed by finding its sparsest representation in the time domain, which can be done adaptively using the soft-threshold function applied in the time domain. Oversampling can be used to improve accuracy.
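
The completion step can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no algorithmic details, so the code below assumes a plain iterative soft-thresholding (ISTA) scheme for an L1-regularized fit to the known frequency bins. The function names `soft_threshold` and `complete_itf`, the threshold `tau`, the iteration count, and the toy data are all hypothetical choices made for illustration.

```python
import numpy as np


def soft_threshold(x, tau):
    """Shrink each entry of x toward zero by tau (proximal map of tau * L1 norm)."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def complete_itf(H_obs, known, tau=0.01, n_iter=500):
    """Estimate a sparse real impulse response from an incomplete transfer function.

    H_obs : complex array of length n (DFT-domain estimate); only the entries
            where `known` is True are used.
    known : boolean array of length n, assumed symmetric so that bin k and
            bin n - k are either both known or both unknown.
    tau   : soft-threshold level (hypothetical tuning parameter).
    """
    H_obs = np.asarray(H_obs, dtype=complex)
    known = np.asarray(known, dtype=bool)
    h = np.zeros(H_obs.shape[0])
    for _ in range(n_iter):
        # Mismatch on the observed bins only; unknown bins contribute nothing.
        residual = np.where(known, H_obs - np.fft.fft(h), 0.0)
        # Gradient step mapped back to the time domain, then shrinkage.
        h = soft_threshold(h + np.fft.ifft(residual).real, tau)
    return h


# Toy usage: a 5-tap sparse response with roughly 30% of the bins missing.
rng = np.random.default_rng(0)
n = 256
h_true = np.zeros(n)
h_true[[3, 17, 40, 90, 150]] = [1.0, -0.7, 0.5, 0.4, -0.3]
half = rng.random(n // 2 + 1) < 0.7                      # bins 0 .. n/2
known = np.concatenate([half, half[1 : n // 2][::-1]])   # mirror to bins n/2+1 .. n-1
h_est = complete_itf(np.fft.fft(h_true), known)
print("max abs error:", np.max(np.abs(h_est - h_true)))
```

In this sketch the missing bins are drawn at random, which is a favorable case. Near-end speech would more plausibly mask contiguous frequency bands, and recovery can then be harder. The soft threshold also biases the recovered taps slightly toward zero, so `tau` trades sparsity against amplitude accuracy.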