CCS-MASAC Resource Allocation Method for Collaborative Cluster Satellite Systems in 6G

IF 8.9 · CAS Zone 1 (Computer Science) · JCR Q1: Computer Science, Information Systems

IEEE Internet of Things Journal Pub Date : 2025-03-30 DOI:10.1109/JIOT.2025.3575158

Jiayi He;Zhiyong Liu;Jiayi Wang;Juan Dong;Qinyu Zhang

IEEE Internet of Things Journal, vol. 12, no. 15, pp. 31797-31812. Full text: https://ieeexplore.ieee.org/document/11018608/

Citations: 0

Abstract

The collaborative cluster satellite system (CCS) within the 6G network establishes the foundation for robust services in the future Star-Earth integrated network by coordinating multiple low-earth orbit (LEO) satellites for collaborative observation missions and efficient space mission processing. This article proposes a model-based soft actor-critic (SAC) algorithm, CCS-MASAC, for optimizing throughput in clustered satellite systems within 6G networks. The algorithm integrates the clustering degree of CCS with the entropy regularization term in SAC, proposing an adaptive adjustment method. Unlike existing studies, in this work, we adopt an environment model-based policy optimization approach for the first time. Model-based policy optimization focuses on improving the sample efficiency of reinforcement learning (RL) algorithms. It allows agents to learn iteratively in both real and simulated environments, which improves sample efficiency, convergence, and algorithm robustness. To address the dimensionality explosion in single-agent RL algorithms, we extend this approach to a multiagent RL algorithm by defining observable neighborhoods for each agent, further enhancing performance. Simulation results indicate that the CCS-MASAC algorithm proposed in this article enhances throughput by 15%–20% and accelerates convergence by 30% compared to existing algorithms, including the multiagent deep Q-network (MADQN), multiagent proximal policy optimization (MAPPO), multiagent deep deterministic policy gradient (MADDPG) and multiagent double and dueling deep Q-learning (MAD3QL). The scalability and robustness of the algorithms are verified by scalability experiments and experiments under dynamic channel conditions. This research provides new solutions for throughput optimization and resource management in CCS systems.
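The abstract's idea of learning "in both real and simulated environments" can be illustrated with a generic model-based data-mixing loop. The following is a minimal sketch, not the CCS-MASAC implementation: the toy 1-D environment, the `LearnedModel` class, the random placeholder policy, the known reward function, and every parameter name (`real_ratio`, `horizon`, and so on) are assumptions made for illustration. In the paper the agents are SAC learners over satellite resource-allocation states, which this sketch does not attempt to reproduce.

```python
import random


def real_step(state, action):
    """Toy 'real' environment (hypothetical): reward favors states near 0."""
    next_state = state + action
    return next_state, -abs(next_state)


class LearnedModel:
    """Hypothetical dynamics model: average state change seen per action."""

    def __init__(self):
        self.delta_sum = {}
        self.count = {}

    def fit(self, transitions):
        self.delta_sum.clear()
        self.count.clear()
        for state, action, _reward, next_state in transitions:
            self.delta_sum[action] = self.delta_sum.get(action, 0.0) + (next_state - state)
            self.count[action] = self.count.get(action, 0) + 1

    def predict(self, state, action):
        if action not in self.count:
            return None  # the model has no data for this action
        next_state = state + self.delta_sum[action] / self.count[action]
        # Assumption: the reward function is known; only dynamics are learned.
        return next_state, -abs(next_state)


def collect_real(buffer, n_steps, rng, actions=(-1, 0, 1)):
    """Gather real transitions with a random placeholder policy."""
    state = 0.0
    for _ in range(n_steps):
        action = rng.choice(actions)
        next_state, reward = real_step(state, action)
        buffer.append((state, action, reward, next_state))
        state = next_state


def simulate(model, real_buffer, n_rollouts, horizon, rng, actions=(-1, 0, 1)):
    """Branch short model rollouts from states that were actually visited."""
    simulated = []
    for _ in range(n_rollouts):
        state = rng.choice(real_buffer)[0]
        for _ in range(horizon):
            action = rng.choice(actions)
            prediction = model.predict(state, action)
            if prediction is None:
                break
            next_state, reward = prediction
            simulated.append((state, action, reward, next_state))
            state = next_state
    return simulated


def mixed_batch(real_buffer, sim_buffer, batch_size, real_ratio, rng):
    """Sample a training batch that mixes real and simulated transitions."""
    n_real = round(batch_size * real_ratio) if sim_buffer else batch_size
    n_sim = batch_size - n_real
    batch = [rng.choice(real_buffer) for _ in range(n_real)]
    batch += [rng.choice(sim_buffer) for _ in range(n_sim)]
    return batch


if __name__ == "__main__":
    rng = random.Random(0)
    real = []
    collect_real(real, n_steps=50, rng=rng)
    model = LearnedModel()
    model.fit(real)
    sim = simulate(model, real, n_rollouts=10, horizon=3, rng=rng)
    batch = mixed_batch(real, sim, batch_size=32, real_ratio=0.25, rng=rng)
    print(len(real), len(sim), len(batch))
```

In a full method, the batch returned by `mixed_batch` would feed the actor and critic updates in place of a purely real replay batch, which is where the sample-efficiency gain described in the abstract would come from. Short rollouts branched from real states are a common way to limit the effect of model error, though whether the paper uses that particular scheme is not stated in the abstract.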


Source journal

IEEE Internet of Things Journal Computer Science-Information Systems

CiteScore: 17.60

Self-citation rate: 13.20%

Articles published: 1982

Journal description: The IEEE Internet of Things (IoT) Journal publishes articles and review articles covering various aspects of IoT, including IoT system architecture, IoT enabling technologies, IoT communication and networking protocols such as network coding, and IoT services and applications. Topics encompass IoT's impact on sensor technologies, big data management, and future Internet design for applications such as smart cities and smart homes. Fields of interest include IoT architectures such as things-centric, data-centric, and service-oriented architectures; IoT enabling technologies and systematic integration such as sensor technologies, big sensor data management, and future Internet design for IoT; IoT services, applications, and test-beds such as IoT service middleware, IoT application programming interfaces (APIs), IoT application design, and IoT trials/experiments; and IoT standardization activities and technology development in standards development organizations (SDOs) such as IEEE, IETF, ITU, 3GPP, and ETSI.