Joint trajectory and offloading optimization in UAV-assisted MEC via federated multi-agent reinforcement learning and potential fields

IF 4.6 · Zone 2 (Computer Science) · JCR Q1 (COMPUTER SCIENCE, HARDWARE & ARCHITECTURE)

Computer Networks · Pub Date: 2025-09-10 · DOI: 10.1016/j.comnet.2025.111681

Cong Wang, Ke Liu, Ying Yuan, Sancheng Peng, Guorui Li

Computer Networks, Volume 272, Article 111681. https://www.sciencedirect.com/science/article/pii/S1389128625006486

Citations: 0

Abstract

Unmanned aerial vehicle (UAV)-assisted mobile edge computing (MEC) is characterized by flexible deployment, high mobility, and dynamic coverage. It enables efficient execution of latency-sensitive tasks in scenarios such as emergency rescue and dynamic computility support, and therefore has significant application prospects. However, the joint scheduling of computility and tasks remains an open issue when optimizing task efficiency and UAV energy consumption. To address this problem, we propose a UAV-assisted MEC framework based on federated multi-agent reinforcement learning (MARL) and potential fields (PF), which jointly optimizes UAV trajectories and task offloading strategies to minimize the age of information (AoI) under latency and energy constraints. The decision-making process of multiple UAVs is modeled as a Partially Observable Markov Decision Process (POMDP) and solved with a distributed federated MARL architecture. An adaptive federated collaboration model is designed for periodic parameter sharing based on credit allocation, which enhances UAV collaboration and alleviates partial observability. Additionally, a PF-based deep reinforcement learning (DRL) trajectory planning algorithm is proposed to enhance the agents' environment perception and decision-making ability. Experimental results show the effectiveness and feasibility of the proposed framework. It outperforms several existing RL-based approaches in terms of data freshness, task efficiency, and other key metrics while demonstrating strong adaptability in dynamic and complex MEC environments.
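The abstract does not specify how the potential field is built, so the following is only a minimal sketch of the classic artificial potential field formulation (quadratic attraction toward a goal, short-range repulsion from obstacles), not the authors' algorithm. The function names, the gains `k_att` and `k_rep`, and the influence radius `d0` are all hypothetical choices made for illustration.

```python
import math

def attractive_force(pos, goal, k_att=1.0):
    """Negative gradient of the quadratic potential 0.5 * k_att * ||pos - goal||^2."""
    return tuple(-k_att * (p - g) for p, g in zip(pos, goal))

def repulsive_force(pos, obstacle, k_rep=1.0, d0=2.0):
    """Negative gradient of 0.5 * k_rep * (1/d - 1/d0)^2, active only when d < d0."""
    diff = tuple(p - o for p, o in zip(pos, obstacle))
    d = math.sqrt(sum(c * c for c in diff))
    if d >= d0 or d == 0.0:
        # Outside the influence radius (or exactly on the obstacle): no force.
        return tuple(0.0 for _ in pos)
    magnitude = k_rep * (1.0 / d - 1.0 / d0) / (d * d)
    # Push away from the obstacle along the unit vector diff / d.
    return tuple(magnitude * c / d for c in diff)

def total_force(pos, goal, obstacles, k_att=1.0, k_rep=1.0, d0=2.0):
    """Sum the attractive force and the repulsive force of every obstacle."""
    force = attractive_force(pos, goal, k_att)
    for obs in obstacles:
        rep = repulsive_force(pos, obs, k_rep, d0)
        force = tuple(f + r for f, r in zip(force, rep))
    return force

# Hypothetical 2-D example: a UAV at (1, 0), a target at (3, 0), an obstacle at the origin.
print(total_force((1.0, 0.0), (3.0, 0.0), [(0.0, 0.0)]))
```

One plausible use, again an assumption rather than something the abstract states, is to append the resulting force vector to each agent's observation or to shape its reward, so that the learned policy receives a hint about the direction of targets and obstacles.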


Source journal

Computer Networks (Engineering & Technology: Telecommunications)

CiteScore: 10.80

Self-citation rate: 3.60%

Articles published: 434

Review time: 8.6 months

About the journal: Computer Networks is an international, archival journal providing a publication vehicle for complete coverage of all topics of interest to those involved in the computer communications networking area. The audience includes researchers, managers and operators of networks as well as designers and implementors. The Editorial Board will consider any material for publication that is of interest to those groups.