Frequency-domain blind speech separation using incomplete de-mixing transform
Zbyněk Koldovský, F. Nesta, P. Tichavský, Nobutaka Ono
2016 24th European Signal Processing Conference (EUSIPCO), published 2016-08-01
DOI: 10.1109/EUSIPCO.2016.7760531 (https://doi.org/10.1109/EUSIPCO.2016.7760531)
Citations: 4
Abstract
We propose a novel solution to the blind speech separation problem where the de-mixing transform is estimated only within selected frequency bins. This solution is based on Independent Vector Analysis applied to a subset of instantaneous mixtures, one per selected frequency bin. Next, two approaches are proposed to complete the transform: one based on null beamforming, and the other based on convex programming. In subsequent experiments, we compare combinations of both methods and evaluate their ability to retrieve the whole de-mixing transform. Depending on the number of selected frequencies and the sparsity of room impulse responses, the methods show improvements in terms of computational complexity as well as in terms of separation accuracy.
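The idea of completing a de-mixing transform from a subset of frequency bins can be illustrated with a small NumPy sketch. This is not the authors' implementation. It assumes a two-microphone, two-source, anechoic (pure-delay) mixture; it replaces the Independent Vector Analysis step with an oracle inverse of the mixing matrix carrying a random per-bin scaling; and it completes the transform with a simple delay-based null beamformer. All function names, the sampling rate, the FFT size, the delays, and the bin selection are hypothetical choices made for illustration.

```python
import numpy as np


def mixing_matrices(freqs_hz, delays_s):
    """Assumed anechoic 2-mic model: column j is [1, exp(-j*2*pi*f*tau_j)]."""
    omega = 2.0 * np.pi * np.asarray(freqs_hz, dtype=float)
    A = np.ones((omega.size, 2, 2), dtype=complex)
    A[:, 1, :] = np.exp(-1j * omega[:, None] * np.asarray(delays_s)[None, :])
    return A


def estimate_delay(freqs_hz, rtf, grid_s):
    """Grid search for the delay whose phase ramp best matches the RTF samples."""
    omega = 2.0 * np.pi * np.asarray(freqs_hz, dtype=float)
    steering = np.exp(1j * np.outer(grid_s, omega))  # (grid points, bins)
    return grid_s[np.argmax(np.abs(steering @ rtf))]


def null_beamformer(freqs_hz, delays_s):
    """Row 0 places a null on source 1, row 1 places a null on source 0."""
    omega = 2.0 * np.pi * np.asarray(freqs_hz, dtype=float)
    W = np.empty((omega.size, 2, 2), dtype=complex)
    W[:, 0, 0] = np.exp(-1j * omega * delays_s[1])
    W[:, 0, 1] = -1.0
    W[:, 1, 0] = np.exp(-1j * omega * delays_s[0])
    W[:, 1, 1] = -1.0
    return W


def run_demo():
    fs, n_fft = 16000, 512                        # hypothetical settings
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)    # 257 bin centre frequencies
    true_delays = np.array([-2.0, 4.0]) / fs      # inter-microphone delays (s)
    selected = np.arange(8, 256, 8)               # the "selected" bins only

    # Stand-in for the separation step on the selected bins: the exact inverse
    # of the mixing matrix, with an arbitrary complex scaling of each output
    # (separation methods leave this scaling undetermined).
    rng = np.random.default_rng(0)
    A_sel = mixing_matrices(freqs[selected], true_delays)
    scale = (0.5 + rng.random((selected.size, 2))) * np.exp(
        2j * np.pi * rng.random((selected.size, 2))
    )
    W_sel = scale[:, :, None] * np.linalg.inv(A_sel)

    # Relative transfer functions are invariant to the per-source scaling.
    A_est = np.linalg.inv(W_sel)
    rtf = A_est[:, 1, :] / A_est[:, 0, :]

    # Fit one delay per source, then build de-mixing matrices for ALL bins.
    grid = np.linspace(-8.0, 8.0, 321) / fs
    est_delays = np.array(
        [estimate_delay(freqs[selected], rtf[:, j], grid) for j in range(2)]
    )
    W_full = null_beamformer(freqs, est_delays)

    # Off-diagonal entries of W*A measure residual cross-talk in every bin.
    P = W_full @ mixing_matrices(freqs, true_delays)
    leakage = max(np.abs(P[:, 0, 1]).max(), np.abs(P[:, 1, 0]).max())
    return est_delays, W_full, leakage


if __name__ == "__main__":
    est_delays, W_full, leakage = run_demo()
    print("estimated delays (samples):", est_delays * 16000)
    print("de-mixing transform shape:", W_full.shape)
    print("max cross-talk over all bins:", leakage)
```

In this toy setting 31 of 257 bins are enough to recover the full transform because the mixture is described by one delay per source. Real rooms are reverberant, which is where the sparsity of the room impulse responses mentioned in the abstract would come into play; that case is not modelled here.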