Federated Reinforcement Learning for Collaborative Intelligence in UAV-Assisted C-V2X Communications

Drones. Pub Date: 2024-07-12. DOI: 10.3390/drones8070321

Abhishek Gupta, Xavier Fernando



Abstract

This paper applies federated reinforcement learning (FRL) to cellular vehicle-to-everything (C-V2X) communication so that vehicles can learn communication parameters in collaboration with a parameter server embedded in an unmanned aerial vehicle (UAV). Different sensors in vehicles capture different types of data, contributing to data heterogeneity. When the sensor data are not independent and identically distributed (non-i.i.d.), C-V2X communication networks incur additional communication overhead in order to converge to a global model. Consequently, the training time for local model updates also varies considerably. Using FRL, we accelerated this convergence by minimizing communication rounds, and we delayed it by exploring the correlation between the data captured by various vehicles in subsequent time steps. Additionally, as UAVs have limited battery power, processing the collected information locally at the vehicles and then transmitting the model hyper-parameters to the UAVs can optimize the pattern in which the available power is consumed. The proposed FRL algorithm updates the global model through adaptive weighting of Q-values at each training round. The contribution of each local model is determined by measuring the local gradients at the vehicles and the global gradient at the UAV. We quantify these Q-values using nonlinear mappings to reinforce positive rewards, so that the contribution of local models is measured dynamically. Moreover, minimizing the number of communication rounds between the UAVs and vehicles is investigated as a viable approach for minimizing delay. A performance evaluation revealed that the FRL approach can yield up to a 40% reduction in the number of communication rounds between vehicles and UAVs when compared to gross data offloading.
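The abstract does not give the aggregation rule, so the following is only a minimal sketch of what adaptive weighting of Q-values at the UAV might look like. It assumes tabular Q-values, scores each vehicle by the cosine similarity between its local gradient and the global gradient, and uses a softmax as the nonlinear mapping. The function names, the similarity measure, the softmax, and the `temperature` parameter are all illustrative assumptions, not the authors' method.

```python
import numpy as np


def contribution_weights(local_grads, global_grad, temperature=1.0):
    """Turn gradient alignment into aggregation weights (hypothetical rule).

    Each vehicle is scored by the cosine similarity between its local
    gradient and the global gradient held at the UAV. A softmax then maps
    the scores to positive weights that sum to 1.
    """
    g = np.asarray(global_grad, dtype=float).ravel()
    g_norm = np.linalg.norm(g) + 1e-12  # guard against division by zero
    scores = []
    for grad in local_grads:
        v = np.asarray(grad, dtype=float).ravel()
        v_norm = np.linalg.norm(v) + 1e-12
        scores.append(float(v @ g) / (v_norm * g_norm))
    scores = np.asarray(scores) / temperature
    # Softmax (assumed nonlinear mapping): well-aligned updates get more weight.
    exp = np.exp(scores - scores.max())
    return exp / exp.sum()


def aggregate_q_tables(local_q_tables, weights):
    """Weighted average of per-vehicle Q-tables, computed at the UAV."""
    stacked = np.stack([np.asarray(q, dtype=float) for q in local_q_tables])
    return np.tensordot(np.asarray(weights, dtype=float), stacked, axes=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Three hypothetical vehicles, each with a 4-state, 2-action Q-table.
    q_tables = [rng.normal(size=(4, 2)) for _ in range(3)]
    global_grad = np.ones((4, 2))
    local_grads = [np.ones((4, 2)), -np.ones((4, 2)), np.zeros((4, 2))]

    weights = contribution_weights(local_grads, global_grad)
    global_q = aggregate_q_tables(q_tables, weights)
    print("weights:", weights)
    print("global Q-table shape:", global_q.shape)
```

In this sketch only the Q-tables and gradients travel to the UAV, which reflects the abstract's point that vehicles process sensor data locally instead of offloading it.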
