Analysis and Optimization of Wireless Multimodal Federated Learning on Modal Heterogeneity

IEEE Transactions on Machine Learning in Communications and Networking. Pub Date: 2025-09-19. DOI: 10.1109/TMLCN.2025.3611977

Xuefeng Han; Wen Chen; Jun Li; Ming Ding; Qingqing Wu; Kang Wei; Xiumei Deng; Yumeng Shao; Qiong Wu

Volume 3, pages 1075-1091. Full text: https://ieeexplore.ieee.org/document/11174013/

Citations: 0

Abstract

Multimodal federated learning (MFL) is a distributed framework for training multimodal models without uploading clients' local multimodal data, thereby protecting client privacy. However, multimodal data is commonly heterogeneous across clients: each client possesses only a subset of all modalities, which renders the conventional analysis results and optimization methods of unimodal federated learning inapplicable. In addition, fixed latency demands and limited communication bandwidth pose significant challenges for deploying MFL in wireless scenarios. To optimize wireless MFL performance under modal heterogeneity, this paper proposes a joint client scheduling and bandwidth allocation (JCSBA) algorithm based on a decision-level fusion architecture augmented with unimodal loss functions. Specifically, using the decision results, the unimodal loss functions are added to both the training objective and the local update loss functions to accelerate multimodal convergence and improve unimodal performance. To characterize MFL performance, we derive a closed-form upper bound related to client and modality scheduling, and JCSBA minimizes this bound under latency, energy, and bandwidth constraints. Experimental results on multimodal datasets demonstrate that JCSBA improves multimodal accuracy and unimodal accuracy by 4.06% and 2.73%, respectively, compared to conventional algorithms.
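The abstract does not give the exact loss, so the following is only a minimal sketch of the general idea of a decision-level fusion objective with added unimodal loss terms. It assumes that fusion averages the per-modality logits and that the unimodal terms are weighted by a factor `lam`. The function names, the averaging rule, and `lam` are all illustrative assumptions, not the paper's formulation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Cross-entropy of one sample given its class logits and true label."""
    return -math.log(softmax(logits)[label])


def fused_loss(modal_logits, label, lam=0.5):
    """Multimodal loss on the fused decision plus weighted unimodal losses.

    modal_logits maps a modality name to that modality's logits, and holds
    only the modalities this client actually possesses (modal heterogeneity).
    Averaging the logits and the weight `lam` are assumptions of this sketch.
    """
    if not modal_logits:
        raise ValueError("client has no modalities")
    outputs = list(modal_logits.values())
    n_classes = len(outputs[0])
    # Decision-level fusion: combine per-modality decisions, not features.
    fused = [sum(o[c] for o in outputs) / len(outputs) for c in range(n_classes)]
    multimodal = cross_entropy(fused, label)
    unimodal = sum(cross_entropy(o, label) for o in outputs)
    return multimodal + lam * unimodal
```

A client holding a single modality passes a one-entry dictionary; the fused decision then coincides with that modality's output, so the same function covers clients with any subset of modalities.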


