{"title":"Analysis and Optimization of Wireless Multimodal Federated Learning on Modal Heterogeneity","authors":"Xuefeng Han;Wen Chen;Jun Li;Ming Ding;Qingqing Wu;Kang Wei;Xiumei Deng;Yumeng Shao;Qiong Wu","doi":"10.1109/TMLCN.2025.3611977","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3611977","url":null,"abstract":"Multimodal federated learning (MFL) is a distributed framework for training multimodal models without uploading local multimodal data of clients, thereby effectively protecting client privacy. However, multimodal data is commonly heterogeneous across diverse clients, where each client possesses only a subset of all modalities, renders conventional analysis results and optimization methods in unimodal federated learning inapplicable. In addition, fixed latency demand and limited communication bandwidth pose significant challenges for deploying MFL in wireless scenarios. To optimize the wireless MFL performance on modal heterogeneity, this paper proposes a joint client scheduling and bandwidth allocation (JCSBA) algorithm based on a decision-level fusion architecture with adding a unimodal loss function. Specifically, with the decision results, the unimodal loss functions are added to both the training objective and local update loss functions to accelerate multimodal convergence and improve unimodal performance. To characterize MFL performance, we derive a closed-form upper bound related to client and modality scheduling and minimize the derived bound under the latency, energy, and bandwidth constraints through JCSBA. Experimental results on multimodal datasets demonstrate that the JCSBA algorithm improves the multimodal accuracy and the unimodal accuracy by 4.06% and 2.73%, respectively, compared to conventional algorithms.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"1075-1091"},"PeriodicalIF":0.0,"publicationDate":"2025-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11174013","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145210159","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Generative AI-Based Vector Quantized End-to-End Semantic Communication System for Wireless Image Transmission","authors":"Maheshi Lokumarambage;Thushan Sivalingam;Feng Dong;Nandana Rajatheva;Anil Fernando","doi":"10.1109/TMLCN.2025.3607891","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3607891","url":null,"abstract":"Semantic communication (SemCom) systems enhance transmission efficiency by conveying semantic information in lieu of raw data. However, challenges arise when designing these systems due to the need for robust semantic source coding for information representation extending beyond the training dataset, maintaining channel-agnostic performance, and ensuring robustness to channel and semantic noise. We propose a novel generative artificial intelligence (AI) based SemCom architecture conditioned on quantized latent. The system reduces the communication overhead of the wireless channel by transmitting the index of the quantized latent over the communication channel by mapping the quantized vector to the learned codebook vectors. The learned codebook is the shared knowledge base. The encoder is designed with a novel spatial attention mechanism based on image energy, focusing on object edges. The critic assesses the realism of generated data relative to the original distribution, with the Wasserstein distance. The model introduces novel contrastive objectives at multiple levels, including pixel, latent, perceptual, and task output, tailored for noisy wireless semantic communication. We validated the proposed model for transmission quality and robustness with low-density parity-check (LDPC), which outperforms the baselines of better portable graphics (BPG), specifically at low signal-to-noise ratio (SNR) levels (<inline-formula> <tex-math>$ {lt }~ {5}$ </tex-math></inline-formula> dB). Additionally, it shows comparable results with joint source-channel coding (JSCC) with lower complexity and latency. The model is validated for human perception and machine perception-oriented task utility. The model effectively transmits high-resolution images without requiring additional error correction at the receiver. We propose a novel semantic-based matrix to evaluate the robustness to noise and task-specific semantic distortion.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"1050-1074"},"PeriodicalIF":0.0,"publicationDate":"2025-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11154002","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145100350","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Robust Defensive Cyber Agent for Multi-Adversary Defense","authors":"Muhammad O. Farooq","doi":"10.1109/TMLCN.2025.3605855","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3605855","url":null,"abstract":"Modern cyber environments are becoming increasingly complex and distributed, often organized into multiple interconnected subnets and nodes. Even relatively small-scale networks can exhibit significant security challenges due to their dynamic topologies and the diversity of potential attack vectors. In modern cyber environments, human-led defense alone is insufficient due to delayed response times, cognitive overload, and limited availability of skilled personnel, particularly in remote or resource-constrained settings. These challenges are intensified by the growing diversity of cyber threats, including adaptive and machine learning-based attacks, which demand rapid and intelligent responses. Addressing this, we propose a reinforcement learning (RL)-based framework that integrates eXtreme Gradient Boosting (XGBoost) and transformer architectures to develop robust, generalizable defensive agents. The proposed agents are evaluated against both baseline defenders trained to counter specific adversaries and hierarchical generic agents representing the current state-of-the-art. Experimental results demonstrate that the RL-XGBoost (integration of RL and XGBoost) agent consistently achieves superior performance in terms of defense accuracy and efficiency across varied adversarial strategies and network configurations. Notably, in scenarios involving changes to network topology, both RL-Transformer (RL combined with transformer architectures) and RL-XGBoost agents exhibit strong adaptability and resilience, outperforming specialized blue agents and hierarchical agents in performance consistency. In particular, the RL-Transformer variant (RL-BERT) demonstrates exceptional robustness when attacker entry points are altered, effectively capturing long-range dependencies and temporal patterns through its self-attention mechanism. Overall, these findings highlight the RL-XGBoost model’s potential as a scalable and intelligent solution for multi-adversary defense in dynamic and heterogeneous cyber environments.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"1030-1049"},"PeriodicalIF":0.0,"publicationDate":"2025-09-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11150430","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145036224","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reverse Engineering Segment Routing Policies and Link Costs With Inverse Reinforcement Learning and EM","authors":"Kai Wang;Chee Wei Tan","doi":"10.1109/TMLCN.2025.3598739","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3598739","url":null,"abstract":"Network routing is a core functionality in computer networks that holds significant potential for integrating newly developed techniques with minimal software effort through the use of Software-Defined Networking (SDN). However, with the ever-expansion of the Internet, traditional destination-based IP routing techniques struggle to meet Quality-of-Service (QoS) requirements with SDN alone. To address these challenges, a modern network routing technique called Segment Routing (SR) has been designed to simplify traffic engineering and make networks more flexible and scalable. However, existing SR routing algorithms used by major Internet Service Providers (ISPs) are mostly proprietary, whose details remain unknown. This study delves into the inverse problem of a general type of SR and attempts to infer the SR policies given expert traffic traces. To this end, we propose MoME, a Mixture-of-Experts (MoE) model using the Maximum Entropy Inverse Reinforcement Learning (MaxEnt-IRL) framework that is capable of incorporating diverse features (e.g., router, link and context) and capturing complex relationships in the link cost, in combination with an Expectation-Maximization (EM) based iterative algorithm that jointly infers link costs and SR policy classes. Experimental results on real-world ISP topologies and Traffic Matrices (TMs) demonstrate the superior performance of our approach in jointly classifying SR policies and inferring link cost functions. Specifically, our model achieves classification accuracies of 0.90, 0.81, 0.75, and 0.57 on datasets that contain five SR policies over the small-scale Abilene and GÉANT, the medium-scale Exodus, and the large-scale Sprintlink network topologies, respectively.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"1014-1029"},"PeriodicalIF":0.0,"publicationDate":"2025-08-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11124467","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144891104","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Explainable AI for Enhancing Efficiency of DL-Based Channel Estimation","authors":"Abdul Karim Gizzini;Yahia Medjahdi;Ali J. Ghandour;Laurent Clavier","doi":"10.1109/TMLCN.2025.3596548","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3596548","url":null,"abstract":"The support of artificial intelligence (AI) based decision-making is a key element in future 6G networks. Moreover, AI is widely employed in critical applications such as autonomous driving and medical diagnosis. In such applications, using AI as black-box models is risky and challenging. Hence, it is crucial to understand and trust the decisions taken by these models. Tackling this issue can be achieved by developing explainable AI (XAI) schemes that aim to explain the logic behind the black-box model behavior, and thus, ensure its efficient and safe deployment. Highlighting the relevant inputs the black-box model uses to accomplish the desired prediction is essential towards ensuring its interpretability. Recently, we proposed a novel perturbation-based feature selection framework called XAI-CHEST and oriented toward channel estimation in wireless communications. This manuscript provides the detailed theoretical foundations of the XAI-CHEST framework. In particular, we derive the analytical expressions of the XAI-CHEST loss functions and the noise threshold fine-tuning optimization problem. Hence the designed XAI-CHEST delivers a smart low-complex one-shot input feature selection methodology for high-dimensional model input that can further improve the overall performance while optimizing the architecture of the employed model. Simulation results show that the XAI-CHEST framework outperforms the classical feature selection XAI schemes such as local interpretable model-agnostic explanations (LIME) and shapley additive explanations (SHAP), mainly in terms of interpretability resolution as well as providing better performance-complexity trade-off.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"976-996"},"PeriodicalIF":0.0,"publicationDate":"2025-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11115091","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144858638","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Out-of-Distribution in Image Semantic Communication: A Solution With Multimodal Large Language Models","authors":"Feifan Zhang;Yuyang Du;Kexin Chen;Yulin Shao;Soung Chang Liew","doi":"10.1109/TMLCN.2025.3595841","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3595841","url":null,"abstract":"Semantic communication is a promising technology for next-generation wireless networks. However, the out-of-distribution (OOD) problem, where a pre-trained machine learning (ML) model is applied to unseen tasks that are outside the distribution of its training data, may compromise the integrity of semantic compression. This paper explores the use of multi-modal large language models (MLLMs) to address the OOD issue in image semantic communication. We propose a novel “Plan A - Plan B” framework that leverages the broad knowledge and strong generalization ability of an MLLM to assist a conventional ML model when the latter encounters an OOD input in the semantic encoding process. Furthermore, we propose a Bayesian optimization scheme that reshapes the probability distribution of the MLLM’s inference process based on the contextual information of the image. The optimization scheme significantly enhances the MLLM’s performance in semantic compression by 1) filtering out irrelevant vocabulary in the original MLLM output; and 2) using contextual similarities between prospective answers of the MLLM and the background information as prior knowledge to modify the MLLM’s probability distribution during inference. Further, at the receiver side of the communication system, we put forth a “generate-criticize” framework that utilizes the cooperation of multiple MLLMs to enhance the reliability of image reconstruction.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"997-1013"},"PeriodicalIF":0.0,"publicationDate":"2025-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11113346","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144858657","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Hierarchical Feature-Based Time Series Clustering Approach for Data-Driven Capacity Planning of Cellular Networks","authors":"Vineeta Jain;Anna Richter;Vladimir Fokow;Mathias Schweigel;Ulf Wetzker;Andreas Frotzscher","doi":"10.1109/TMLCN.2025.3595125","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3595125","url":null,"abstract":"The growing popularity of cellular networks among users, primarily due to affordable prices and high speeds, has escalated the need for strategic capacity planning to ensure a seamless end-user experience and profitable returns on network investments. Traditional capacity planning methods rely on static analysis of network parameters with the aim of minimizing the CAPEX and the OPEX. However, to address the evolving dynamics of cellular networks, this paper advocates for a data-driven approach that considers user behavioral analysis in the planning process to make it proactive and adaptive. We introduce a Hierarchical Feature-based Time Series Clustering (HFTSC) approach that organizes clustering in a multi-level tree structure. Each level addresses a specific aspect of time series data using focused features, enabling explainable clustering. The proposed approach assigns labels to clusters based on the time series properties targeted at each level, generating annotated clusters while applying unsupervised clustering methods. To evaluate the effectiveness of HFTSC, we conduct a comprehensive case study using real-world data from thousands of network elements. Our evaluation examines the identified clusters from analytical and geographical perspectives, focusing on supporting network planners in data-informed decision-making and analysis. Finally, we perform an extensive comparison with several baseline methods to reflect the practical advantages of our approach in capacity planning and optimization.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"921-947"},"PeriodicalIF":0.0,"publicationDate":"2025-08-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11108703","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144831870","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"TelecomGPT: A Framework to Build Telecom-Specific Large Language Models","authors":"Hang Zou;Qiyang Zhao;Yu Tian;Lina Bariah;Faouzi Bader;Thierry Lestable;Merouane Debbah","doi":"10.1109/TMLCN.2025.3593184","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3593184","url":null,"abstract":"The emergent field of Large Language Models (LLMs) has significant potential to revolutionize how future telecom networks are designed and operated. However, mainstream Large Language Models (LLMs) lack the specialized knowledge required to understand and operate within the highly technical telecom domain. In this paper, we introduce TelecomGPT, the first telecom-specific LLM, built through a systematic adaptation pipeline designed to enhance general-purpose LLMs for telecom applications. To achieve this, we curate comprehensive telecom-specific datasets, including pre-training datasets, instruction datasets, and preference datasets. These datasets are leveraged for continual pre-training, instruction tuning, and alignment tuning, respectively. Additionally, due to the lack of widely accepted evaluation benchmarks that are tailored for the telecom domain, we proposed three novel LLM-Telecom evaluation benchmarks, namely, Telecom Math Modeling, Telecom Open QnA, and Telecom Code Tasks. These new benchmarks provide a holistic evaluation of the capabilities of LLMs in telecom math modeling, open-ended question answering, code generation, infilling, summarization and analysis. Using the curated datasets, our fine-tuned LLM, TelecomGPT, significantly outperforms general-purpose state of the art (SOTA) LLMs, including GPT-4, Llama-3 and Mistral, particularly in Telecom Math Modeling benchmarks. Additionally, it achieves comparable performance across various evaluation benchmarks, such as TeleQnA, 3GPP technical document classification, telecom code summarization, generation, and infilling. This work establishes a new foundation for integrating LLMs into telecom systems, paving the way for AI-powered advancements in network operations.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"948-975"},"PeriodicalIF":0.0,"publicationDate":"2025-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11097898","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144858656","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ORANSight-2.0: Foundational LLMs for O-RAN","authors":"Pranshav Gajjar;Vijay K. Shah","doi":"10.1109/TMLCN.2025.3592658","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3592658","url":null,"abstract":"Despite the transformative impact of Large Language Models (LLMs) across critical domains such as healthcare, customer service, and business marketing, their integration into Open Radio Access Networks (O-RAN) remains limited. This gap is primarily due to the absence of domain-specific foundational models, with existing solutions often relying on general-purpose LLMs that fail to address the unique challenges and technical intricacies of O-RAN. To bridge this gap, we introduce ORANSight-2.0 (O-RAN Insights), a pioneering initiative to develop specialized foundational LLMs tailored for O-RAN. Built on 18 models spanning five open-source LLM frameworks—Mistral, Qwen, Llama, Phi, and Gemma—ORANSight-2.0 fine-tunes models ranging from 1B to 70B parameters, significantly reducing reliance on proprietary, closed-source models while enhancing performance in O-RAN-specific tasks. At the core of ORANSight-2.0 is RANSTRUCT, a novel Retrieval-Augmented Generation (RAG)-based instruction-tuning framework that employs two LLM agents—a Mistral-based Question Generator and a Qwen-based Answer Generator—to create high-quality instruction-tuning datasets. The generated dataset is then used to fine-tune the 18 pre-trained open-source LLMs via QLoRA. To evaluate ORANSight-2.0, we introduce srsRANBench, a novel benchmark designed for code generation and codebase understanding in the context of srsRAN, a widely used 5G O-RAN stack. Additionally, we leverage ORAN-Bench-13K, an existing benchmark for assessing O-RAN-specific knowledge. Our comprehensive evaluations demonstrate that ORANSight-2.0 models outperform general-purpose and closed-source models, such as ChatGPT-4o and Gemini, by 5.421% on ORANBench and 18.465% on srsRANBench, achieving superior performance while maintaining lower computational and energy costs. We also experiment with RAG-augmented variants of ORANSight-2.0 models and observe that RAG augmentation improves performance by an average of 6.35% across benchmarks, achieving the best overall cumulative score of 0.854, which is 12.37% better than the leading closed-source alternative. We thoroughly evaluate the energy characteristics of ORANSight-2.0, demonstrating its efficiency in training, inference, and inference with RAG augmentation, ensuring optimal performance while maintaining low computational and energy costs. Additionally, the best ORANSight-2.0 configuration is compared against the available telecom LLMs, where our proposed model outperformed them with an average improvement of 27.96%.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"903-920"},"PeriodicalIF":0.0,"publicationDate":"2025-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11096935","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144831852","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"User Handover Aware Hierarchical Federated Learning for Open RAN-Based Next-Generation Mobile Networks","authors":"Amardip Kumar Singh;Kim Khoa Nguyen","doi":"10.1109/TMLCN.2025.3587205","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3587205","url":null,"abstract":"The Open Radio Access Network (O-RAN) architecture, enhanced by its AI-enabled Radio Intelligent Controllers (RIC), offers a more flexible and intelligent solution to optimize next generation networks compared to traditional mobile network architectures. By leveraging its distributed structure, which aligns seamlessly with O-RAN’s disaggregated design, Federated Learning (FL), particularly Hierarchical FL, facilitates decentralized AI model training, improving network performance, reducing resource costs, and safeguarding user privacy. However, the dynamic nature of mobile networks, particularly the frequent handovers of User Equipment (UE) between base stations, poses significant challenges for FL model training. These challenges include managing continuously changing device sets and mitigating the impact of handover delays on global model convergence. To address these challenges, we propose MHORANFed, a novel optimization algorithm tailored to minimize learning time and resource usage costs while preserving model performance within a mobility-aware hierarchical FL framework for O-RAN. Firstly, MHORANFed simplifies the upper layer of the HFL training at edge aggregate servers, which reduces the model complexity and thereby improves the learning time and the resource usage cost. Secondly, it uses jointly optimized bandwidth resource allocation and handed over local trainers’ participation to mitigate the UE handover delay in each global round. Through a rigorous convergence analysis and extensive simulation results, this work demonstrates its superiority over existing state-of-the-art methods. Furthermore, our findings underscore significant improvements in FL training efficiency, paving the way for advanced applications such as autonomous driving and augmented reality in 5G and next-generation O-RAN networks.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"848-863"},"PeriodicalIF":0.0,"publicationDate":"2025-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11075644","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144739815","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}