Qingxiao Xiu , Jun Liu , Xiangjun Liu , Jintao Wang
Journal: Computer Networks, Volume 272, Article 111680
DOI: 10.1016/j.comnet.2025.111680
Published: 2025-09-09 (Journal Article; Impact Factor 4.6; JCR Q1, Computer Science, Hardware & Architecture)
URL: https://www.sciencedirect.com/science/article/pii/S1389128625006474
Computation offloading and resource allocation in satellite edge computing networks: A multi-agent reinforcement learning approach
The development of Low Earth Orbit (LEO) satellite networks and Mobile Edge Computing (MEC) technologies supports the placement of MEC servers on LEO satellites, facilitating computation offloading in remote areas where computing resources are limited. However, the on-board computing and communication resources of LEO satellite networks are similarly constrained, while the system environment remains highly dynamic and complex. Moreover, diverse task requirements often demand offloading across multiple time slots, which increases the complexity of offloading decisions and resource allocation for terrestrial tasks. In this study, we model this problem within satellite edge computing networks as a partially observable Markov decision process (POMDP). To achieve effective joint optimization, we introduce a multi-agent recurrent attentional double delayed deep deterministic policy gradient (MARATD3) algorithm. First, we use a recurrent neural network (RNN) to summarize users' historical observations, which improves adaptability to dynamic system environments and enables accurate prediction of system states. Then, a multi-head attention mechanism is introduced to strengthen the ability of user agents to capture critical information within the joint state space, reduce interference from irrelevant information, and improve training efficiency. According to the experimental results, MARATD3 achieves a considerable reduction in energy consumption and delay relative to the baseline algorithms while satisfying task-delay and resource constraints.
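Two building blocks named in the abstract can be sketched concretely: scaled dot-product attention, by which an agent weights the joint state so that irrelevant agents' observations are down-weighted, and the clipped double-Q Bellman target underlying TD3-style ("double delayed") critic updates. The following is a minimal illustrative sketch, not the paper's implementation; all function names, shapes, and constants are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def attention_pool(query, keys, values):
    """Scaled dot-product attention: one agent's query scores the
    observations (values) of all agents; softmax weights emphasize the
    most relevant ones and suppress irrelevant information."""
    d = query.shape[-1]
    scores = keys @ query / np.sqrt(d)       # (n_agents,) similarity scores
    weights = np.exp(scores - scores.max())  # numerically stable softmax
    weights /= weights.sum()
    return weights @ values                  # attended summary of the joint state

def td3_target(rewards, q1_next, q2_next, dones, gamma=0.99):
    """Clipped double-Q Bellman target: taking the minimum of two target
    critics curbs value overestimation; (1 - done) masks terminal steps."""
    return rewards + gamma * (1.0 - dones) * np.minimum(q1_next, q2_next)

# Toy usage: 4 agents with 8-dimensional observations.
obs = rng.normal(size=(4, 8))
summary = attention_pool(obs[0], obs, obs)   # agent 0 attends over all agents

targets = td3_target(np.array([1.0, 0.5]),   # rewards
                     np.array([10.0, 4.0]),  # target critic 1
                     np.array([8.0, 6.0]),   # target critic 2
                     np.array([0.0, 1.0]))   # done flags
# targets[0] = 1.0 + 0.99 * min(10, 8) = 8.92; targets[1] = 0.5 (terminal)
```

In practice the attention output would feed an RNN-conditioned actor, and the critic update would be delayed relative to the actor update, per the TD3 recipe the abstract alludes to.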
Journal Introduction:
Computer Networks is an international, archival journal providing complete coverage of all topics of interest to those involved in computer communications networking. Its audience includes researchers, managers, and operators of networks, as well as designers and implementers. The Editorial Board will consider any material for publication that is of interest to these groups.