Spectro-temporal Filtering based on The Beta-divergence for Speech Separation using Nonnegative Matrix Factorization
M. Fakhry
2021 4th International Seminar on Research of Information Technology and Intelligent Systems (ISRITI), published 2021-12-16
DOI: 10.1109/ISRITI54043.2021.9702880 (https://doi.org/10.1109/ISRITI54043.2021.9702880)
Citations: 0
Abstract
Nonnegative matrix factorization (NMF) has proven highly effective for supervised speech separation. In this setting, nonnegative spectral basis matrices, one for each source in an observed mixture, are trained independently. The trained matrices are then used to compute the corresponding nonnegative temporal activation matrices. Estimates of the source signals in the mixture are obtained through Wiener gains, which are derived by minimizing the Euclidean distance between the true and estimated source signals. In this paper, we propose to quantify this distance using the Beta-divergence ($\beta$-divergence), which has been used successfully in NMF. The proposed gains are derived by minimizing the distance measured by the divergence and are then applied within supervised NMF for speech separation. The experimental evaluation shows that the gain computed with the Beta-divergence at $\beta=1.5$ performs better than the conventional Wiener gain.
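
The supervised pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the paper's implementation. It assumes the mixture `V` is a nonnegative spectrogram, that the per-source basis matrices are already trained, and that the activations are fitted with the standard multiplicative update for the $\beta$-divergence. The function names (`beta_divergence`, `fit_activations`, `separate`) and the toy data are hypothetical. Only the conventional Wiener-style gain is shown, because the abstract does not give the formula of the proposed $\beta$-divergence gain.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero


def beta_divergence(x, y, beta):
    """Sum of element-wise beta-divergences d_beta(x | y) for positive arrays."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if beta == 1:  # generalized Kullback-Leibler divergence
        return float(np.sum(x * np.log(x / y) - x + y))
    if beta == 0:  # Itakura-Saito divergence
        return float(np.sum(x / y - np.log(x / y) - 1))
    num = x**beta + (beta - 1) * y**beta - beta * x * y**(beta - 1)
    return float(np.sum(num / (beta * (beta - 1))))


def fit_activations(V, W, beta=1.5, n_iter=200, seed=0):
    """Estimate H >= 0 with W held fixed so that V ~ W @ H.

    Uses the standard multiplicative update for the beta-divergence
    (an assumption; the abstract does not specify the update rule).
    """
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + EPS
    for _ in range(n_iter):
        WH = W @ H + EPS
        H *= (W.T @ (V * WH**(beta - 2))) / (W.T @ WH**(beta - 1) + EPS)
    return H


def separate(V, bases, beta=1.5, n_iter=200):
    """Split mixture spectrogram V using pre-trained per-source bases.

    Returns one estimate per source, using conventional Wiener-style gains:
    the ratio of each source model to the sum of all source models.
    """
    W = np.hstack(bases)                      # concatenate the trained bases
    H = fit_activations(V, W, beta, n_iter)   # activations for the mixture
    models, start = [], 0
    for Wj in bases:
        stop = start + Wj.shape[1]
        models.append(Wj @ H[start:stop])     # model spectrogram of source j
        start = stop
    total = sum(models) + EPS
    return [V * (Sj / total) for Sj in models]


# Toy usage with random stand-ins for trained bases (synthetic data, not speech).
rng = np.random.default_rng(1)
W1, W2 = rng.random((16, 3)), rng.random((16, 3))
V = W1 @ rng.random((3, 20)) + W2 @ rng.random((3, 20))
est1, est2 = separate(V, [W1, W2])
```

Because the gains sum to one in every time-frequency bin, the source estimates add back up to the mixture. Setting `beta=2` in `beta_divergence` gives half the squared Euclidean distance, the case that corresponds to the conventional Wiener gain in the abstract.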