Efthymios Tzinis, Jonah Casebeer, Zhepei Wang, P. Smaragdis
{"title":"Separate But Together: Unsupervised Federated Learning for Speech Enhancement from Non-IID Data","authors":"Efthymios Tzinis, Jonah Casebeer, Zhepei Wang, P. Smaragdis","doi":"10.1109/WASPAA52581.2021.9632783","DOIUrl":null,"url":null,"abstract":"We propose FedEnhance, an unsupervised federated learning (FL) approach for speech enhancement and separation with non-IID distributed data across multiple clients. We simulate a realworld scenario where each client only has access to a few noisy recordings from a limited and disjoint number of speakers (hence non-IID). Each client trains their model in isolation using mixture invariant training while periodically providing updates to a central server. Our experiments show that our approach achieves competitive enhancement performance compared to IID training on a single device and that we can further facilitate the convergence speed and the overall performance using transfer learning on the server-side. Moreover, we show that we can effectively combine updates from clients trained locally with supervised and unsupervised losses. We also release a new dataset LibriFSD50K and its creation recipe in order to facilitate FL research for source separation problems.","PeriodicalId":429900,"journal":{"name":"2021 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)","volume":"32 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-05-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"13","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/WASPAA52581.2021.9632783","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 13
Abstract
We propose FedEnhance, an unsupervised federated learning (FL) approach for speech enhancement and separation with non-IID distributed data across multiple clients. We simulate a realworld scenario where each client only has access to a few noisy recordings from a limited and disjoint number of speakers (hence non-IID). Each client trains their model in isolation using mixture invariant training while periodically providing updates to a central server. Our experiments show that our approach achieves competitive enhancement performance compared to IID training on a single device and that we can further facilitate the convergence speed and the overall performance using transfer learning on the server-side. Moreover, we show that we can effectively combine updates from clients trained locally with supervised and unsupervised losses. We also release a new dataset LibriFSD50K and its creation recipe in order to facilitate FL research for source separation problems.