Multimodal Federated Learning on IoT Data

2022 IEEE/ACM Seventh International Conference on Internet-of-Things Design and Implementation (IoTDI). Pub Date: 2021-09-10. DOI: 10.1109/iotdi54339.2022.00011

Yuchen Zhao, P. Barnaghi, H. Haddadi


Citations: 23

Abstract

Federated learning is proposed as an alternative to centralized machine learning since its client-server structure provides better privacy protection and scalability in real-world applications. In many applications, such as smart homes with Internet-of-Things (IoT) devices, local data on clients are generated from different modalities such as sensory, visual, and audio data. Existing federated learning systems only work on local data from a single modality, which limits the scalability of the systems. In this paper, we propose a multimodal and semi-supervised federated learning framework that trains autoencoders to extract shared or correlated representations from different local data modalities on clients. In addition, we propose a multimodal FedAvg algorithm to aggregate local autoencoders trained on different data modalities. We use the learned global autoencoder for a downstream classification task with the help of auxiliary labelled data on the server. We empirically evaluate our framework on different modalities including sensory data, depth camera videos, and RGB camera videos. Our experimental results demonstrate that introducing data from multiple modalities into federated learning can improve its classification performance. In addition, we can use labelled data from only one modality for supervised learning on the server and apply the learned model to testing data from other modalities to achieve decent $F_{1}$ scores (e.g., with the best performance being higher than 60%), especially when combining contributions from both unimodal clients and multimodal clients.
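The aggregation step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not spell out the multimodal FedAvg rule, so the code below is an assumption rather than the authors' algorithm: it treats the global autoencoder as a set of named parameter groups and, for each group, takes a sample-weighted average over only those clients whose local model contains that group. The function name `multimodal_fedavg` and the group names `enc_A` and `enc_B` are hypothetical.

```python
import numpy as np


def multimodal_fedavg(client_updates, global_params):
    """Average each parameter group over the clients that trained it.

    client_updates: list of (num_samples, params) pairs, where params maps a
        hypothetical group name such as "enc_A" to a NumPy array.
    global_params: current global parameters, kept unchanged for any group
        that no client updated in this round.
    """
    new_params = {}
    for name, current in global_params.items():
        # Keep only the clients whose local model includes this group.
        contributions = [(n, p[name]) for n, p in client_updates if name in p]
        if not contributions:
            new_params[name] = current.copy()
            continue
        total = sum(n for n, _ in contributions)
        # Standard FedAvg weighting: proportional to local sample count.
        new_params[name] = sum((n / total) * w for n, w in contributions)
    return new_params


# Toy round: one unimodal client (modality A) and one multimodal client (A and B).
global_params = {"enc_A": np.zeros(2), "enc_B": np.zeros(2)}
updates = [
    (100, {"enc_A": np.array([1.0, 1.0])}),
    (300, {"enc_A": np.array([3.0, 3.0]), "enc_B": np.array([2.0, 2.0])}),
]
result = multimodal_fedavg(updates, global_params)
```

In this toy round, `enc_A` is averaged over both clients with weights 0.25 and 0.75, while `enc_B` is taken from the multimodal client alone. Any extra weighting between unimodal and multimodal clients would be a further design choice that this sketch does not model.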

