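SEROS: Shared exploration and reward optimization task scheduling strategy for multi-agent collaboration in edge computing networks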

Impact Factor 4.8; Zone 3 (Computer Science); JCR Q1 (Computer Science, Information Systems)

Ad Hoc Networks, Volume 179, Article 103992. Pub Date: 2025-09-17. DOI: 10.1016/j.adhoc.2025.103992

Lei Jin, Junyan Chen, Rui Yao, Jiahao Chen, Xinmei Li


Citations: 0

SEROS: Shared exploration and reward optimization task scheduling strategy for multi-agent collaboration in edge computing networks

Integrating mobile edge computing (MEC) into dynamic wireless ad hoc networks has intensified the challenges of computational resource scheduling, particularly under queue load constraints. Current research focuses primarily on global average metrics and neglects fairness in resource allocation among devices when tasks span multiple time slots. As a result, the resources allocated to different devices diverge significantly, and some devices consistently lack computational resources because of imbalanced scheduling. To address these limitations, we propose SEROS (Shared Exploration and Reward Optimization Strategy), a multi-agent reinforcement learning framework for cross-timeslot task scheduling in MEC environments. The method dynamically balances local optimization objectives with global collaboration through a weighted shared reward mechanism and improves training efficiency through hybrid use of sample trajectories, enabling adaptive task offloading decisions. First, we construct an MEC model that incorporates queue load constraints to address cross-timeslot task scheduling, improving resource utilization for time-sensitive workloads through delayed optimization objectives. Second, we design a collaborative incentive mechanism based on global–local reward balancing and develop a sample trajectory-sharing scheme that accelerates policy convergence while preserving agent specialization. Simulation experiments show that, compared with baseline methods, SEROS improves the task completion rate by 7% and the concentration of inter-device completion rates by 40%, and also performs better in stability and task completion time.
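The abstract names two mechanisms, a weighted shared reward and a trajectory-sharing scheme, but does not give their exact form. The Python sketch below is only one plausible reading: it assumes the shared reward is a linear blend of each agent's local reward with the team mean, and that "hybrid" sampling means drawing a fixed fraction of each training batch from other agents' replay buffers. The names `shared_reward`, `HybridReplay`, `weight`, and `share_ratio` are hypothetical and do not come from the paper.

```python
import random
from collections import deque


def shared_reward(local_rewards, weight):
    """Blend each agent's local reward with the team-average reward.

    weight = 0 keeps purely local rewards; weight = 1 gives every agent
    the same global (mean) reward. The linear form is an assumption,
    not the paper's stated formula.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    global_reward = sum(local_rewards) / len(local_rewards)
    return [(1.0 - weight) * r + weight * global_reward for r in local_rewards]


class HybridReplay:
    """Per-agent replay buffers with a tunable share of borrowed samples.

    Hypothetical illustration of trajectory sharing: each agent trains
    mostly on its own transitions and partly on its peers' transitions.
    """

    def __init__(self, n_agents, capacity=10_000, seed=0):
        self.buffers = [deque(maxlen=capacity) for _ in range(n_agents)]
        self.rng = random.Random(seed)

    def add(self, agent, transition):
        """Store one transition in the given agent's own buffer."""
        self.buffers[agent].append(transition)

    def sample(self, agent, batch_size, share_ratio):
        """Draw a batch mixing the agent's own and its peers' transitions."""
        own = list(self.buffers[agent])
        peers = [t for i, buf in enumerate(self.buffers) if i != agent for t in buf]
        # Cap both parts by what is actually available in the buffers.
        n_shared = min(int(round(batch_size * share_ratio)), len(peers))
        n_own = min(batch_size - n_shared, len(own))
        return self.rng.sample(own, n_own) + self.rng.sample(peers, n_shared)


if __name__ == "__main__":
    print(shared_reward([1.0, 0.0, 2.0], weight=0.5))  # [1.0, 0.5, 1.5]
    replay = HybridReplay(n_agents=2)
    for step in range(10):
        replay.add(0, ("agent0", step))
        replay.add(1, ("agent1", step))
    print(replay.sample(agent=0, batch_size=4, share_ratio=0.25))
```

Under these assumptions, a larger `weight` pushes agents toward the common objective (and so toward more even completion rates across devices), while a smaller `share_ratio` keeps each agent's training data closer to its own experience, which is one way to read "preserving agent specialization".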


Source journal: Ad Hoc Networks (Engineering Technology, Telecommunications)

CiteScore: 10.20

Self-citation rate: 4.20%

Articles published: 131

Review time: 4.8 months

About the journal: Ad Hoc Networks is an international and archival journal that provides a publication vehicle for complete coverage of all topics of interest to those involved in ad hoc and sensor networking. The journal considers original, high-quality, unpublished contributions addressing all aspects of ad hoc and sensor networks. Specific areas of interest include, but are not limited to: mobile and wireless ad hoc networks; sensor networks; wireless local and personal area networks; home networks; ad hoc networks of autonomous intelligent systems; novel architectures for ad hoc and sensor networks; self-organizing network architectures and protocols; transport layer protocols; routing protocols (unicast, multicast, geocast, etc.); media access control techniques; error control schemes; power-aware, low-power and energy-efficient designs; synchronization and scheduling issues; mobility management; mobility-tolerant communication protocols; location tracking and location-based services; resource and information management; security and fault-tolerance issues; hardware and software platforms, systems, and testbeds; experimental and prototype results; quality-of-service issues; cross-layer interactions; scalability issues; performance analysis and simulation of protocols.