Separate But Together: Unsupervised Federated Learning for Speech Enhancement from Non-IID Data

2021 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA). Pub Date: 2021-05-11. DOI: 10.1109/WASPAA52581.2021.9632783

Efthymios Tzinis, Jonah Casebeer, Zhepei Wang, P. Smaragdis


Citations: 13

Abstract

We propose FedEnhance, an unsupervised federated learning (FL) approach for speech enhancement and separation with non-IID data distributed across multiple clients. We simulate a real-world scenario in which each client has access to only a few noisy recordings from a limited set of speakers, disjoint across clients (hence non-IID). Each client trains its model in isolation using mixture invariant training while periodically providing updates to a central server. Our experiments show that our approach achieves competitive enhancement performance compared to IID training on a single device, and that transfer learning on the server side can further improve convergence speed and overall performance. Moreover, we show that we can effectively combine updates from clients trained locally with supervised and unsupervised losses. We also release a new dataset, LibriFSD50K, and its creation recipe to facilitate FL research on source separation problems.
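
The two mechanisms named in the abstract, mixture invariant training on each client and periodic updates to a central server, can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract states neither the exact loss nor the aggregation rule, so the squared-error objective and the size-weighted parameter averaging below are assumptions, and the names `mixit_loss` and `federated_average` are hypothetical.

```python
import itertools


def mse(a, b):
    """Mean squared error between two equal-length signals."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def remix(sources, length):
    """Sum a (possibly empty) list of signals sample by sample."""
    out = [0.0] * length
    for s in sources:
        out = [o + v for o, v in zip(out, s)]
    return out


def mixit_loss(mix1, mix2, est_sources):
    """Mixture invariant training loss (squared-error variant, an assumption).

    est_sources are the model's outputs for the summed input mix1 + mix2.
    Each estimated source is assigned to exactly one of the two reference
    mixtures, and the assignment with the lowest error defines the loss,
    so no clean reference sources are required.
    """
    length = len(mix1)
    best = float("inf")
    # Brute-force search over all 2^M source-to-mixture assignments.
    for assign in itertools.product([0, 1], repeat=len(est_sources)):
        group1 = [s for s, a in zip(est_sources, assign) if a == 0]
        group2 = [s for s, a in zip(est_sources, assign) if a == 1]
        loss = mse(mix1, remix(group1, length)) + mse(mix2, remix(group2, length))
        best = min(best, loss)
    return best


def federated_average(client_params, client_sizes):
    """Server-side aggregation: size-weighted mean of client parameters.

    client_params is a list of dicts mapping a parameter name to a flat
    list of floats; client_sizes holds each client's number of examples.
    Weighted averaging is an assumed rule, not taken from the abstract.
    """
    total = float(sum(client_sizes))
    averaged = {}
    for name in client_params[0]:
        length = len(client_params[0][name])
        averaged[name] = [
            sum(p[name][i] * n / total for p, n in zip(client_params, client_sizes))
            for i in range(length)
        ]
    return averaged
```

In a simulated round, each client would minimize `mixit_loss` on its own noisy recordings for some number of local steps, then send its parameters to the server, which would call `federated_average` and broadcast the result back to the clients.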

