MMFed: A Multimodal Federated Learning Framework for Heterogeneous Devices
Gang Wang;Yanfeng Zhang;Chenhao Ying;Qinnan Zhang;Zehui Xiong;Jiakang Wang;Ge Yu
IEEE Internet of Things Journal, vol. 12, no. 18, pp. 36893-36907, July 2025, doi: 10.1109/JIOT.2025.3579858
Existing federated learning (FL) frameworks are primarily designed for single-modal data, whereas real-world scenarios require processing multimodal data on heterogeneous devices. This gap makes multimodal training on heterogeneous devices challenging and significantly degrades model training efficiency. To address these issues, we propose MMFed, a multimodal FL framework that integrates a multimodal algorithm with a semi-synchronous training method. The multimodal algorithm trains local autoencoders on different data modalities. By leveraging the similarity of encodings across modalities that share the same data labels, we further train and aggregate these local autoencoders into a global autoencoder, which is then deployed on the blockchain to perform downstream classification tasks. In the semi-synchronous training method, each device updates its parameters independently during a round, and at the end of each round a global aggregation combines the updates from all devices. We empirically evaluate our framework on several multimodal datasets: the Opportunity (Opp) Challenge, mHealth, and UR Fall Detection datasets. Experimental results demonstrate that MMFed outperforms state-of-the-art multimodal frameworks on all three datasets, achieving an average accuracy improvement of 9.07%. Furthermore, in terms of training speed, MMFed clearly outperforms fully synchronous strategies when scaled to a large number of clients.
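The abstract describes two mechanisms: per-modality autoencoders whose encodings are aligned across modalities through shared labels, and a semi-synchronous round in which each device trains independently until a round-end global aggregation. The paper's code is not reproduced here, so the sketch below is only a hypothetical reading of that description: the ModalityAE class, the prototype-based alignment loss, the per-modality parameter averaging, and all client/step counts are illustrative assumptions, not MMFed's actual implementation.

```python
# Illustrative sketch only: the paper publishes no code here, so every name
# (ModalityAE, local_update, prototype alignment, per-modality averaging)
# is a hypothetical reading of the abstract, not MMFed's actual design.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

CODE_DIM, NUM_CLASSES = 8, 3

class ModalityAE(nn.Module):
    """Per-modality autoencoder; encoders of all modalities share CODE_DIM."""
    def __init__(self, in_dim: int):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(in_dim, 32), nn.ReLU(), nn.Linear(32, CODE_DIM))
        self.dec = nn.Sequential(nn.Linear(CODE_DIM, 32), nn.ReLU(), nn.Linear(32, in_dim))
    def forward(self, x):
        z = self.enc(x)
        return z, self.dec(z)

def local_update(model, x, y, protos, steps, lam=0.1, lr=1e-2):
    """One client's work within a round. `steps` varies per device: faster
    hardware fits more updates into the same wall-clock round (semi-synchronous)."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(steps):
        z, x_hat = model(x)
        loss = F.mse_loss(x_hat, x)            # reconstruction objective
        if protos is not None:                 # pull same-label codes together across modalities
            loss = loss + lam * F.mse_loss(z, protos[y])
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model.state_dict()

def average_states(states):
    """End-of-round aggregation: parameter averaging within same-architecture clients."""
    out = copy.deepcopy(states[0])
    for k in out:
        out[k] = torch.stack([s[k].float() for s in states]).mean(0)
    return out

# Toy setup: modality A on clients 0-1, modality B on clients 2-3,
# with per-client step budgets simulating heterogeneous device speeds.
torch.manual_seed(0)
dims = {"A": 20, "B": 12}
clients = [("A", 5), ("A", 20), ("B", 8), ("B", 40)]   # (modality, steps per round)
data = [(torch.randn(64, dims[m]), torch.randint(0, NUM_CLASSES, (64,))) for m, _ in clients]
models = {m: ModalityAE(d) for m, d in dims.items()}
protos = None                                          # per-class prototype codes

for rnd in range(5):                                   # semi-synchronous rounds
    states = {"A": [], "B": []}
    for (mod, steps), (x, y) in zip(clients, data):
        local = copy.deepcopy(models[mod])             # start from current global weights
        states[mod].append(local_update(local, x, y, protos, steps))
    for mod in models:                                 # round-end global aggregation
        models[mod].load_state_dict(average_states(states[mod]))
    with torch.no_grad():                              # refresh cross-modal label prototypes
        zs = torch.cat([models[m].enc(x) for (m, _), (x, _) in zip(clients, data)])
        ys = torch.cat([y for _, y in data])
        protos = torch.stack([zs[ys == c].mean(0) for c in range(NUM_CLASSES)])
    print(f"round {rnd}: prototype norm {protos.norm():.3f}")
```

Under this reading, "semi-synchronous" means the server never waits on stragglers within a round: a slow device simply contributes fewer local steps (5 versus 40 above) before the fixed round boundary, which is one plausible reason such a strategy scales better than fully synchronous aggregation as the number of clients grows.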
Journal Introduction:
The IEEE Internet of Things (IoT) Journal publishes articles and review articles covering various aspects of IoT, including IoT system architecture, IoT enabling technologies, IoT communication and networking protocols (such as network coding), and IoT services and applications. Topics encompass IoT's impact on sensor technologies, big data management, and future Internet design for applications such as smart cities and smart homes. Fields of interest include: IoT architecture, such as things-centric, data-centric, and service-oriented IoT architecture; IoT enabling technologies and systematic integration, such as sensor technologies, big sensor data management, and future Internet design for IoT; IoT services, applications, and test-beds, such as IoT service middleware, IoT application programming interfaces (APIs), IoT application design, and IoT trials/experiments; and IoT standardization activities and technology development in different standards development organizations (SDOs) such as IEEE, IETF, ITU, 3GPP, and ETSI.