Task offloading and multi-cache placement based on DRL in UAV-assisted MEC networks

Impact factor 6.5; Zone 2, Computer Science; JCR Q1, Telecommunications

Vehicular Communications Pub Date : 2025-02-26 DOI:10.1016/j.vehcom.2025.100900

Kai Xue, Linbo Zhai, Yumei Li, Zekun Lu, Wenjie Zhou

Vehicular Communications, volume 53, Article 100900. https://www.sciencedirect.com/science/article/pii/S2214209625000270

Citation count: 0

Abstract

Unmanned aerial vehicles (UAVs) are a promising technology for assisting mobile edge computing (MEC) systems because of their reliable wireless communication, flexible computing services, and flexible deployment. However, with massive data volumes and strict task delay requirements, reducing the system cost is challenging. This paper studies task offloading and cache space placement for ground users and proposes a multi-UAV-assisted computing framework: a four-layer transmission system composed of ground users (UEs), UAVs, edge data centers (EDCs), and remote clouds. By jointly optimizing UAV cache space, flight path, offloading decision, channel ratio, and battery power, we formulate a problem that minimizes the long-term average weighted cost of the system under cache space and computing resource constraints. Because this is a mixed-integer problem, we design a task offloading and cache placement algorithm based on deep reinforcement learning (DRL), namely the Cooperative Long-term Average Cost Minimization Optimization Algorithm (CLACMO). First, we transform the mixed action space using embedding tables and a conditional variational autoencoder (VAE), mapping the mixed action variables into a latent action space. This unifies discrete and continuous actions and addresses the mixed action spaces that traditional DRL algorithms struggle with. Second, building on DRL, we minimize the long-term average weighted system cost more efficiently under the task offloading and cache placement constraints. The results show that, compared with the PER-UOS-RL, MASAC, and MADDPG algorithms, the average reward increases by 54.5%, 66.7%, and 69.7%, respectively, and the average task completion rate increases by 12.9%, 38.1%, and 9.11%, respectively, demonstrating the effectiveness of the proposed method.
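The mapping between a mixed (discrete plus continuous) action and a latent action can be illustrated with a minimal sketch. It is not the authors' CLACMO implementation: the dimensions, the variable names, and the untrained linear maps standing in for the conditional VAE encoder and decoder are all assumptions, chosen only to show how an embedding table handles the discrete choice while a conditional decoder recovers the continuous parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

# All sizes below are hypothetical, chosen for illustration only.
NUM_DISCRETE = 4   # e.g. candidate offloading targets
EMBED_DIM = 3      # width of one embedding-table row
STATE_DIM = 5      # size of the observed state vector
CONT_DIM = 2       # e.g. a channel ratio and a power level
LATENT_DIM = 2     # size of the continuous latent code

# Embedding table: one row per discrete action (random here, not trained).
embedding_table = rng.normal(size=(NUM_DISCRETE, EMBED_DIM))

# Untrained linear stand-ins for the conditional VAE encoder and decoder.
W_enc = 0.1 * rng.normal(size=(CONT_DIM + STATE_DIM + EMBED_DIM, 2 * LATENT_DIM))
W_dec = 0.1 * rng.normal(size=(LATENT_DIM + STATE_DIM + EMBED_DIM, CONT_DIM))


def encode(state, k, x):
    """Map a mixed action (discrete k, continuous x) to a latent pair (e_k, z)."""
    e_k = embedding_table[k]
    h = np.concatenate([x, state, e_k]) @ W_enc
    mean, log_var = h[:LATENT_DIM], h[LATENT_DIM:]
    # Reparameterization: sample z from N(mean, exp(log_var)).
    z = mean + np.exp(0.5 * log_var) * rng.normal(size=LATENT_DIM)
    return e_k, z


def decode(state, e, z):
    """Map a latent action (e, z) back to a mixed action (k, x)."""
    # Discrete part: index of the nearest embedding-table row.
    k = int(np.argmin(np.linalg.norm(embedding_table - e, axis=1)))
    # Continuous part: decoder conditioned on the state and the chosen row,
    # squashed into (0, 1) with a sigmoid.
    logits = np.concatenate([z, state, embedding_table[k]]) @ W_dec
    x = 1.0 / (1.0 + np.exp(-logits))
    return k, x


state = rng.normal(size=STATE_DIM)
e_k, z = encode(state, k=2, x=np.array([0.3, 0.7]))
k_rec, x_rec = decode(state, e_k, z)
print(k_rec, x_rec)
```

In a trained system the agent's policy would output the latent pair directly and `decode` would turn it into an executable action; here the weights are random, so only the discrete index is recovered faithfully.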


Source journal

Vehicular Communications (Engineering: Electrical and Electronic Engineering)

CiteScore: 12.70

Self-citation rate: 10.40%

Review time: 62 days

Journal description: Vehicular communications is a growing area of communications between vehicles, including roadside communication infrastructure. Advances in wireless communications are making it possible to share information through real-time communication between vehicles and infrastructure. This has led to applications that increase vehicle safety and enable communication between passengers and the Internet. Standardization efforts on vehicular communication are also underway to make vehicular transportation safer, greener, and easier. The aim of the journal is to publish high-quality peer-reviewed papers in the area of vehicular communications. The scope encompasses all types of communications involving vehicles, including vehicle-to-vehicle and vehicle-to-infrastructure. The scope includes (but is not limited to) the following topics related to vehicular communications: vehicle-to-vehicle and vehicle-to-infrastructure communications; channel modelling, modulating and coding; congestion control and scalability issues; protocol design, testing and verification; routing in vehicular networks; security issues and countermeasures; deployment and field testing; reducing energy consumption and enhancing safety of vehicles; wireless in-car networks; data collection and dissemination methods; mobility and handover issues; safety and driver assistance applications; UAV; underwater communications; autonomous cooperative driving; social networks; Internet of vehicles; standardization of protocols.