IEEE Transactions on Machine Learning in Communications and Networking: Latest Articles

Randomized Quantization for Privacy in Resource Constrained Machine Learning at-the-Edge and Federated Learning
IEEE Transactions on Machine Learning in Communications and Networking Pub Date : 2025-03-10 DOI: 10.1109/TMLCN.2025.3550119
Ce Feng;Parv Venkitasubramaniam
{"title":"Randomized Quantization for Privacy in Resource Constrained Machine Learning at-the-Edge and Federated Learning","authors":"Ce Feng;Parv Venkitasubramaniam","doi":"10.1109/TMLCN.2025.3550119","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3550119","url":null,"abstract":"The increasing adoption of machine learning at the edge (ML-at-the-edge) and federated learning (FL) presents a dual challenge: ensuring data privacy as well as addressing resource constraints such as limited computational power, memory, and communication bandwidth. Traditional approaches typically apply differentially private stochastic gradient descent (DP-SGD) to preserve privacy, followed by quantization techniques as a post-processing step to reduce model size and communication overhead. However, this sequential framework introduces inherent drawbacks, as quantization alone lacks privacy guarantees and often introduces errors that degrade model performance. In this work, we propose randomized quantization as an integrated solution to address these dual challenges by embedding randomness directly into the quantization process. This approach enhances privacy while simultaneously reducing communication and computational overhead. To achieve this, we introduce Randomized Quantizer Projection Stochastic Gradient Descent (RQP-SGD), a method designed for ML-at-the-edge that embeds DP-SGD within a randomized quantization-based projection during model training. For federated learning, we develop Gaussian Sampling Quantization (GSQ), which integrates discrete Gaussian sampling into the quantization process to ensure local differential privacy (LDP). Unlike conventional methods that rely on Gaussian noise addition, GSQ achieves privacy through discrete Gaussian sampling while improving communication efficiency and model utility across distributed systems. 
Through rigorous theoretical analysis and extensive experiments on benchmark datasets, we demonstrate that these methods significantly enhance the utility-privacy trade-off and computational efficiency in both ML-at-the-edge and FL systems. RQP-SGD is evaluated on MNIST and the Breast Cancer Diagnostic dataset, showing an average 10.62% utility improvement over the deterministic quantization-based projected DP-SGD while maintaining (1.0, 0)-DP. In federated learning tasks, GSQ-FL improves accuracy by an average 11.52% over DP-FedPAQ across MNIST and FashionMNIST under non-IID conditions. Additionally, GSQ-FL outperforms DP-FedPAQ by 16.54% on CIFAR-10 and 8.7% on FEMNIST.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"395-419"},"PeriodicalIF":0.0,"publicationDate":"2025-03-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10919124","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143645337","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
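The abstract above describes embedding randomness directly into the quantization step. As a rough illustration only, the following Python sketch shows plain unbiased stochastic rounding onto a uniform grid. It is not RQP-SGD or GSQ (GSQ samples from a discrete Gaussian, which is not reproduced here); the function name `stochastic_quantize`, the `step` parameter, and the uniform grid are all assumptions made for this example.

```python
import math
import random


def stochastic_quantize(x: float, step: float, rng: random.Random) -> float:
    """Randomly round x to one of its two neighbouring multiples of `step`.

    Hypothetical helper for illustration; not the paper's quantizer.
    The upper neighbour is chosen with probability equal to the fractional
    position of x between the two grid points, so the output equals x in
    expectation.
    """
    scaled = x / step
    lower = math.floor(scaled)
    frac = scaled - lower  # fractional position, in [0, 1)
    level = lower + (1 if rng.random() < frac else 0)
    return level * step


if __name__ == "__main__":
    rng = random.Random(0)
    samples = [stochastic_quantize(0.3, 1.0, rng) for _ in range(10_000)]
    print(sorted(set(samples)))         # the grid points that were produced
    print(sum(samples) / len(samples))  # empirical mean, close to 0.3
```

Because the rounding direction is random, a single quantized value reveals less about the exact input than deterministic rounding would, which is the intuition the abstract builds on. A formal privacy guarantee needs the analysis given in the paper, not this sketch.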
Paths Optimization by Jointing Link Management and Channel Estimation Using Variational Autoencoder With Attention for IRS-MIMO Systems
IEEE Transactions on Machine Learning in Communications and Networking Pub Date : 2025-03-03 DOI: 10.1109/TMLCN.2025.3547689
Meng-Hsun Wu;Hong-Yunn Chen;Ta-Wei Yang;Chih-Chuan Hsu;Chih-Wei Huang;Cheng-Fu Chou
{"title":"Paths Optimization by Jointing Link Management and Channel Estimation Using Variational Autoencoder With Attention for IRS-MIMO Systems","authors":"Meng-Hsun Wu;Hong-Yunn Chen;Ta-Wei Yang;Chih-Chuan Hsu;Chih-Wei Huang;Cheng-Fu Chou","doi":"10.1109/TMLCN.2025.3547689","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3547689","url":null,"abstract":"In massive MIMO systems, achieving optimal end-to-end transmission encompasses various aspects such as power control, modulation schemes, path selection, and accurate channel estimation. Nonetheless, optimizing resource allocation remains a significant challenge. In path selection, the direct link is a straightforward link between the transmitter and the receiver. On the other hand, the indirect link involves reflections, diffraction, or scattering, often due to interactions with objects or obstacles. Relying exclusively on one type of link can lead to suboptimal and limited performance. Link management (LM) is emerging as a viable solution, and accurate channel estimation provides essential information to make informed decisions about transmission parameters. In this paper, we study LM and channel estimation that flexibly adjust the transmission ratio of direct and indirect links to improve generalization, using a denoising variational autoencoder with attention modules (DVAE-ATT) to enhance sum rate. Our experiments show significant improvements in IRS-assisted millimeter-wave MIMO systems. Incorporating LM increased the sum rate and reduced MSE by approximately 9%. Variational autoencoders (VAE) outperformed traditional autoencoders in the spatial domain, as confirmed by heatmap analysis. Additionally, our investigation of DVAE-ATT reveals notable differences in the temporal domain with and without attention mechanisms. Finally, we analyze performance across varying numbers of users and ranges. 
Across various distances—5m, 15m, 25m, and 35m—performance improvements averaged 6%, 11%, 16%, and 22%, respectively.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"381-394"},"PeriodicalIF":0.0,"publicationDate":"2025-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10909334","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143583165","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A Novel Multiple Access Scheme for Heterogeneous Wireless Communications Using Symmetry-Aware Continual Deep Reinforcement Learning
IEEE Transactions on Machine Learning in Communications and Networking Pub Date : 2025-02-28 DOI: 10.1109/TMLCN.2025.3546183
Hamidreza Mazandarani;Masoud Shokrnezhad;Tarik Taleb
{"title":"A Novel Multiple Access Scheme for Heterogeneous Wireless Communications Using Symmetry-Aware Continual Deep Reinforcement Learning","authors":"Hamidreza Mazandarani;Masoud Shokrnezhad;Tarik Taleb","doi":"10.1109/TMLCN.2025.3546183","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3546183","url":null,"abstract":"The Metaverse holds the potential to revolutionize digital interactions through the establishment of a highly dynamic and immersive virtual realm over wireless communications systems, offering services such as massive twinning and telepresence. This landscape presents novel challenges, particularly efficient management of multiple access to the frequency spectrum, for which numerous adaptive Deep Reinforcement Learning (DRL) approaches have been explored. However, challenges persist in adapting agents to heterogeneous and non-stationary wireless environments. In this paper, we present a novel approach that leverages Continual Learning (CL) to enhance intelligent Medium Access Control (MAC) protocols, featuring an intelligent agent coexisting with legacy User Equipments (UEs) with varying numbers, protocols, and transmission profiles unknown to the agent for the sake of backward compatibility and privacy. We introduce an adaptive Double and Dueling Deep Q-Learning (D3QL)-based MAC protocol, enriched by a symmetry-aware CL mechanism, which maximizes intelligent agent throughput while ensuring fairness. 
Mathematical analysis validates the efficiency of our proposed scheme, showcasing superiority over conventional DRL-based techniques in terms of throughput, collision rate, and fairness, coupled with real-time responsiveness in highly dynamic scenarios.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"353-368"},"PeriodicalIF":0.0,"publicationDate":"2025-02-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10908203","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143570563","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
UAV-Assisted Unbiased Hierarchical Federated Learning: Performance and Convergence Analysis
IEEE Transactions on Machine Learning in Communications and Networking Pub Date : 2025-02-26 DOI: 10.1109/TMLCN.2025.3546181
Ruslan Zhagypar;Nour Kouzayha;Hesham ElSawy;Hayssam Dahrouj;Tareq Y. Al-Naffouri
{"title":"UAV-Assisted Unbiased Hierarchical Federated Learning: Performance and Convergence Analysis","authors":"Ruslan Zhagypar;Nour Kouzayha;Hesham ElSawy;Hayssam Dahrouj;Tareq Y. Al-Naffouri","doi":"10.1109/TMLCN.2025.3546181","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3546181","url":null,"abstract":"The development of the sixth-generation (6G) of wireless networks is driving computation toward the network edge, where Hierarchical Federated Learning (HFL) plays a pivotal role in distributing learning across edge devices. In HFL, edge devices train local models and send updates to an edge server for local aggregation, which are then forwarded to a central server for global aggregation. However, the unreliability of communication channels at the edge and backhaul links poses a significant bottleneck for HFL-enabled systems. To address this challenge, this paper proposes an unbiased HFL algorithm for Uncrewed Aerial Vehicle (UAV)-assisted wireless networks. While applicable to terrestrial base stations (BSs), the proposed algorithm relies on UAVs for local model aggregation thanks to their ability to enhance wireless channels with lower latency and improved coverage. The proposed algorithm adjusts update weights during local and global aggregations at UAVs to mitigate the impact of unreliable channels. To quantify channel unreliability in HFL, stochastic geometry tools are employed to assess success probabilities of local and global model parameter transmissions. Incorporating these metrics aims to mitigate biases towards devices with better channel conditions in UAV-assisted networks. The paper further examines the theoretical convergence of the proposed unbiased UAV-assisted HFL algorithm under adverse channel conditions and highlights the impact of the limited battery capacity of the UAV on the efficiency of the HFL algorithm. Additionally, the algorithm facilitates optimization of system parameters such as UAV count, altitude, battery capacity, etc. 
The simulation results underscore the effectiveness of the proposed unbiased HFL scheme, demonstrating a 5.5% higher accuracy and approximately 85% faster convergence compared to conventional HFL algorithms. We make our code available at the following GitHub repository: UAV-assisted Unbiased HFL Code.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"420-447"},"PeriodicalIF":0.0,"publicationDate":"2025-02-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10904929","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143645156","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
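The abstract above says that update weights are adjusted using transmission success probabilities so that devices with better channels are not favoured. The abstract does not state the exact weighting rule, so the sketch below uses a generic inverse-probability weighting as a stand-in. The function name `unbiased_aggregate`, the scalar updates, and the assumption that each success probability is known are all hypothetical.

```python
import random
from typing import Sequence


def unbiased_aggregate(
    updates: Sequence[float],
    success_probs: Sequence[float],
    received: Sequence[bool],
) -> float:
    """Inverse-probability-weighted average of the updates that arrived.

    Hypothetical illustration, not the paper's rule. Each received update
    is scaled by 1 / p_i and the sum is divided by the total number of
    devices, so the expectation over channel outcomes equals the plain
    mean of all updates.
    """
    n = len(updates)
    total = 0.0
    for update, prob, ok in zip(updates, success_probs, received):
        if ok:
            total += update / prob
    return total / n


if __name__ == "__main__":
    rng = random.Random(0)
    updates, probs = [1.0, 5.0], [0.9, 0.3]
    rounds = 50_000
    acc = 0.0
    for _ in range(rounds):
        got = [rng.random() < p for p in probs]
        acc += unbiased_aggregate(updates, probs, got)
    print(acc / rounds)  # empirical mean, close to the true average 3.0
```

Without the `1 / p_i` factor, the device with the 0.9 success probability would dominate the long-run average, which is the kind of bias the abstract refers to.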
On Traffic Prediction With Knowledge-Driven Spatial–Temporal Graph Convolutional Network Aided by Selected Attention Mechanism
IEEE Transactions on Machine Learning in Communications and Networking Pub Date : 2025-02-26 DOI: 10.1109/TMLCN.2025.3545777
Yuwen Qian;Tianyang Qiu;Chuan Ma;Yiyang Ni;Long Yuan;Xiangwei Zhou;Jun Li
{"title":"On Traffic Prediction With Knowledge-Driven Spatial–Temporal Graph Convolutional Network Aided by Selected Attention Mechanism","authors":"Yuwen Qian;Tianyang Qiu;Chuan Ma;Yiyang Ni;Long Yuan;Xiangwei Zhou;Jun Li","doi":"10.1109/TMLCN.2025.3545777","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3545777","url":null,"abstract":"Intelligent transportation systems grapple with the formidable task of precisely forecasting real-time traffic conditions, where the traffic dynamics exhibit intricacies arising from spatial and temporal dependencies. The urban road network presents a complex web of interconnected roads, where the state of traffic on one road can influence the conditions of others. Moreover, the prediction of traffic conditions necessitates the consideration of diverse temporal factors. Notably, the proximity of a time point to the present moment wields a more substantial impact on subsequent states. In this paper, we propose the knowledge-driven graph convolutional network (KGCN) aided by the gated recurrent unit with a selected attention mechanism (GSAM) to predict traffic flow. In particular, KGCN is employed to capture the correlation of the external knowledge factors for the road and the spatial dependencies, and the gated recurrent unit (GRU) is used to cope with temporal dependence. Furthermore, to improve traffic prediction accuracy, we propose the GRU combined with a selected attention mechanism with Gumbel-Max to predict traffic at the temporal dimension, where a selector is chosen to dynamically assign the feature in various time intervals with different weights. Experimental results with real-life data show the proposed KGCN with GSAM can achieve high accuracy in traffic prediction. 
Compared to the traditional traffic prediction method, the proposed KGCN with GSAM can achieve higher efficacy and robustness when capturing global dynamic temporal dependencies, external knowledge factor correlations, and spatial correlations.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"369-380"},"PeriodicalIF":0.0,"publicationDate":"2025-02-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10904899","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143570620","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
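The abstract above mentions a selector based on Gumbel-Max. For readers unfamiliar with the term, the sketch below shows the generic Gumbel-Max trick for drawing a categorical sample from unnormalized log-probabilities. It is not the paper's GSAM module; the function name `gumbel_max_sample` and the example weights are assumptions for illustration.

```python
import math
import random
from typing import Sequence


def gumbel_max_sample(logits: Sequence[float], rng: random.Random) -> int:
    """Draw index i with probability softmax(logits)[i] via Gumbel-Max.

    Independent Gumbel(0, 1) noise is added to every logit and the index
    of the largest perturbed value is returned.
    """
    best_index, best_score = 0, -math.inf
    for i, logit in enumerate(logits):
        # Clamp u away from 0 and 1 so both logarithms are defined.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel = -math.log(-math.log(u))
        score = logit + gumbel
        if score > best_score:
            best_index, best_score = i, score
    return best_index


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical attention weights over three time intervals.
    probs = [0.7, 0.2, 0.1]
    logits = [math.log(p) for p in probs]
    counts = [0, 0, 0]
    for _ in range(20_000):
        counts[gumbel_max_sample(logits, rng)] += 1
    print([c / 20_000 for c in counts])  # empirical frequencies, near probs
```

In a trainable model the hard argmax is usually replaced by a differentiable relaxation such as Gumbel-Softmax; whether the paper does so is not stated in the abstract.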
RACH Traffic Prediction in Massive Machine Type Communications
IEEE Transactions on Machine Learning in Communications and Networking Pub Date : 2025-02-17 DOI: 10.1109/TMLCN.2025.3542760
Hossein Mehri;Hani Mehrpouyan;Hao Chen
{"title":"RACH Traffic Prediction in Massive Machine Type Communications","authors":"Hossein Mehri;Hani Mehrpouyan;Hao Chen","doi":"10.1109/TMLCN.2025.3542760","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3542760","url":null,"abstract":"Traffic pattern prediction has emerged as a promising approach for efficiently managing and mitigating the impacts of event-driven bursty traffic in massive machine-type communication (mMTC) networks. However, achieving accurate predictions of bursty traffic remains a non-trivial task due to the inherent randomness of events, and these challenges intensify within live network environments. Consequently, there is a compelling imperative to design a lightweight and agile framework capable of assimilating continuously collected data from the network and accurately forecasting bursty traffic in mMTC networks. This paper addresses these challenges by presenting a machine learning-based framework tailored for forecasting bursty traffic in multi-channel slotted ALOHA networks. The proposed machine learning network comprises long short-term memory (LSTM) and a DenseNet with feed-forward neural network (FFNN) layers, where the residual connections enhance the training ability of the machine learning network in capturing complicated patterns. Furthermore, we develop a new low-complexity online prediction algorithm that updates the states of the LSTM network by leveraging frequently collected data from the mMTC network. Simulation results and complexity analysis demonstrate the superiority of our proposed algorithm in terms of both accuracy and complexity, making it well-suited for time-critical live scenarios. We evaluate the performance of the proposed framework in a network with a single base station and thousands of devices organized into groups with distinct traffic-generating characteristics. 
Comprehensive evaluations and simulations indicate that our proposed machine learning approach achieves a remarkable 52% higher accuracy in long-term predictions compared to traditional methods, without imposing additional processing load on the system.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"315-331"},"PeriodicalIF":0.0,"publicationDate":"2025-02-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10891603","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143480782","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Federated Learning-Based Collaborative Wideband Spectrum Sensing and Scheduling for UAVs in UTM Systems
IEEE Transactions on Machine Learning in Communications and Networking Pub Date : 2025-02-11 DOI: 10.1109/TMLCN.2025.3540747
Sravan Reddy Chintareddy;Keenan Roach;Kenny Cheung;Morteza Hashemi
{"title":"Federated Learning-Based Collaborative Wideband Spectrum Sensing and Scheduling for UAVs in UTM Systems","authors":"Sravan Reddy Chintareddy;Keenan Roach;Kenny Cheung;Morteza Hashemi","doi":"10.1109/TMLCN.2025.3540747","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3540747","url":null,"abstract":"In this paper, we propose a data-driven framework for collaborative wideband spectrum sensing and scheduling for networked unmanned aerial vehicles (UAVs), which act as secondary users (SUs) to opportunistically utilize detected “spectrum holes”. Our overall framework consists of three main stages. Firstly, in the model training stage, we explore dataset generation in a multi-cell environment and train a machine learning (ML) model using the federated learning (FL) architecture. Unlike the existing studies on FL for wireless that presume datasets are readily available for training, we propose an end-to-end architecture that directly integrates wireless dataset generation, which involves capturing I/Q samples from over-the-air signals in a multi-cell environment, into the FL training process. To this purpose, we propose a multi-label classification problem for wideband spectrum sensing to detect multiple spectrum holes simultaneously based on the I/Q samples collected locally by the UAVs. In the traditional FL that employs federated averaging (FedAvg) as the aggregating method, each UAV is assigned an equal weight during model aggregation. However, due to the differences in wireless channels observed at each UAV in a multi-cell environment, the received signal powers and collected datasets at different UAV locations could be significantly different, which could degrade the FL performance using equal weights. 
To address this issue, we propose a proportional weighted federated averaging method (pwFedAvg) in which the aggregating weights are proportional to the received signal powers at each UAV, thereby integrating the intrinsic properties of wireless channels into the FL algorithm. Secondly, in the collaborative spectrum inference stage, we propose a collaborative spectrum fusion strategy that is compatible with the unmanned aircraft system traffic management (UTM) ecosystem. In particular, we improve the accuracy of spectrum sensing results by combining the multi-label classification results from the individual UAVs by performing spectrum fusion at a central server. Finally, in the spectrum scheduling stage, we leverage reinforcement learning (RL) solutions to dynamically allocate the detected spectrum holes to the secondary users. To evaluate the proposed methods, we establish a comprehensive simulation framework that generates a near-realistic synthetic dataset using MATLAB LTE toolbox by incorporating base station (BS) locations in a chosen area of interest, performing ray-tracing, and emulating the primary user’s channel usage in terms of I/Q samples. 
This evaluation methodology provides a flexible framework to generate large spectrum datasets that could be used for developing ML/AI-based spectrum management solutions for aerial devices.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"296-314"},"PeriodicalIF":0.0,"publicationDate":"2025-02-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10879292","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143480779","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
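The abstract above contrasts equal-weight FedAvg with pwFedAvg, where the aggregation weights are proportional to the received signal power at each UAV. A minimal sketch of that weighting idea follows. The function name `pw_fedavg`, the flat list-of-floats model representation, and the use of linear power units (for example mW, not dBm) are assumptions; the paper's actual normalization may differ.

```python
from typing import List, Sequence


def pw_fedavg(
    client_params: Sequence[Sequence[float]],
    rx_powers: Sequence[float],
) -> List[float]:
    """Average client parameter vectors with power-proportional weights.

    Hypothetical sketch: weight_k = rx_powers[k] / sum(rx_powers), with
    powers given in linear units. Equal powers reduce to plain FedAvg
    with equal weights.
    """
    if len(client_params) != len(rx_powers) or not client_params:
        raise ValueError("need one positive power value per client")
    total_power = sum(rx_powers)
    if total_power <= 0:
        raise ValueError("total received power must be positive")
    weights = [p / total_power for p in rx_powers]
    dim = len(client_params[0])
    return [
        sum(w * params[i] for w, params in zip(weights, client_params))
        for i in range(dim)
    ]


if __name__ == "__main__":
    # Two UAVs; the second observes three times the signal power.
    print(pw_fedavg([[1.0, 0.0], [3.0, 4.0]], [1.0, 3.0]))  # [2.5, 3.0]
```

Here the second client receives weight 0.75, so the aggregate is pulled toward its parameters.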
Reinforcement Learning With Selective Exploration for Interference Management in mmWave Networks
IEEE Transactions on Machine Learning in Communications and Networking Pub Date : 2025-02-03 DOI: 10.1109/TMLCN.2025.3537967
Son Dinh-van;van-Linh Nguyen;Berna Bulut Cebecioglu;Antonino Masaracchia;Matthew D. Higgins
{"title":"Reinforcement Learning With Selective Exploration for Interference Management in mmWave Networks","authors":"Son Dinh-van;van-Linh Nguyen;Berna Bulut Cebecioglu;Antonino Masaracchia;Matthew D. Higgins","doi":"10.1109/TMLCN.2025.3537967","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3537967","url":null,"abstract":"The next generation of wireless systems will leverage the millimeter-wave (mmWave) bands to meet the increasing traffic volume and high data rate requirements of emerging applications (e.g., ultra HD streaming, metaverse, and holographic telepresence). In this paper, we address the joint optimization of beamforming, power control, and interference management in multi-cell mmWave networks. We propose novel reinforcement learning algorithms, including a single-agent-based method (BPC-SA) for centralized settings and a multi-agent-based method (BPC-MA) for distributed settings. To tackle the high-variance rewards caused by narrow antenna beamwidths, we introduce a selective exploration method to guide the agent towards more intelligent exploration. Our proposed algorithms are well-suited for scenarios where beamforming vectors require control in either a discrete domain, such as a codebook, or in a continuous domain. Furthermore, they do not require channel state information, extensive feedback from user equipments, or any searching methods, thus reducing overhead and enhancing scalability. Numerical results demonstrate that selective exploration improves per-user spectral efficiency by up to 22.5% compared to scenarios without it. 
Additionally, our algorithms significantly outperform existing methods by 50% in terms of per-user spectral efficiency and achieve 90% of the per-user spectral efficiency of the exhaustive search approach while requiring only 0.1% of its computational runtime.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"280-295"},"PeriodicalIF":0.0,"publicationDate":"2025-02-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10869481","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143422871","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Knowledge- and Model-Driven Deep Reinforcement Learning for Efficient Federated Edge Learning: Single- and Multi-Agent Frameworks
IEEE Transactions on Machine Learning in Communications and Networking Pub Date : 2025-01-27 DOI: 10.1109/TMLCN.2025.3534754
Yangchen Li;Lingzhi Zhao;Tianle Wang;Lianghui Ding;Feng Yang
{"title":"Knowledge- and Model-Driven Deep Reinforcement Learning for Efficient Federated Edge Learning: Single- and Multi-Agent Frameworks","authors":"Yangchen Li;Lingzhi Zhao;Tianle Wang;Lianghui Ding;Feng Yang","doi":"10.1109/TMLCN.2025.3534754","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3534754","url":null,"abstract":"In this paper, we investigate federated learning (FL) efficiency improvement in practical edge computing systems, where edge workers have non-independent and identically distributed (non-IID) local data, as well as dynamic and heterogeneous computing and communication capabilities. We consider a general FL algorithm with configurable parameters, including the number of local iterations, mini-batch sizes, step sizes, aggregation weights, and quantization parameters, and provide a rigorous convergence analysis. We formulate a joint optimization problem for FL worker selection and algorithm parameter configuration to minimize the final test loss subject to time and energy constraints. The resulting problem is a complicated stochastic sequential decision-making problem with an implicit objective function and unknown transition probabilities. To address these challenges, we propose knowledge/model-driven single-agent and multi-agent deep reinforcement learning (DRL) frameworks. We transform the primal problem into a Markov decision process (MDP) for the single-agent DRL framework and a decentralized partially-observable Markov decision process (Dec-POMDP) for the multi-agent DRL framework. We develop efficient single-agent and multi-agent asynchronous advantage actor-critic (A3C) approaches to solve the MDP and Dec-POMDP, respectively. In both frameworks, we design a knowledge-based reward to facilitate effective DRL and propose a model-based stochastic policy to tackle the mixed discrete-continuous actions and large action spaces. 
To reduce the computational complexities of policy learning and execution, we introduce a segmented actor-critic architecture for the single-agent DRL and a distributed actor-critic architecture for the multi-agent DRL. Numerical results demonstrate the effectiveness and advantages of the proposed frameworks in enhancing FL efficiency.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"332-352"},"PeriodicalIF":0.0,"publicationDate":"2025-01-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10854500","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143480780","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Risk-Aware Reinforcement Learning Framework for User-Centric O-RAN
IEEE Transactions on Machine Learning in Communications and Networking Pub Date : 2025-01-24 DOI: 10.1109/TMLCN.2025.3534139
Shahrukh Khan Kasi;Fahd Ahmed Khan;Sabit Ekin;Ali Imran
{"title":"Risk-Aware Reinforcement Learning Framework for User-Centric O-RAN","authors":"Shahrukh Khan Kasi;Fahd Ahmed Khan;Sabit Ekin;Ali Imran","doi":"10.1109/TMLCN.2025.3534139","DOIUrl":"https://doi.org/10.1109/TMLCN.2025.3534139","url":null,"abstract":"The evolution of Open Radio Access Networks (O-RAN) presents an opportunity to enhance network performance by enabling dynamic orchestration of configuration and optimization parameters (COPs) through online learning methods. However, leveraging this potential requires overcoming the limitations of traditional cell-centric RAN architectures, which lack the necessary flexibility. On the other hand, despite their recent popularity, the practical deployment of online learning frameworks, such as Deep Reinforcement Learning (DRL)-based COP optimization solutions, remains limited due to their risk of deteriorating network performance during the exploration phase. In this article, we propose and analyze a novel risk-aware DRL framework for user-centric RAN (UC-RAN), which offers both the architectural flexibility and COP optimization to exploit this flexibility. We investigate and identify UC-RAN COPs that can be optimized via a soft actor-critic algorithm implementable as an O-RAN application (rApp) to jointly maximize latency satisfaction, reliability satisfaction, area spectral efficiency, and energy efficiency. We use the offline learning on UC-RAN to reliably accelerate DRL training, thus minimizing the risk of DRL deteriorating cellular network performance. 
Results show that our proposed solution approaches near-optimal performance in just a few hundred iterations with a decrease in risk score by a factor of ten.","PeriodicalId":100641,"journal":{"name":"IEEE Transactions on Machine Learning in Communications and Networking","volume":"3 ","pages":"195-214"},"PeriodicalIF":0.0,"publicationDate":"2025-01-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10852269","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143105944","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0