Meiyi Yang;Deyun Gao;Chuan Heng Foh;Wei Quan;Victor C. M. Leung
{"title":"异构网络中基于多代理强化学习的联合缓存和路由选择","authors":"Meiyi Yang;Deyun Gao;Chuan Heng Foh;Wei Quan;Victor C. M. Leung","doi":"10.1109/TCCN.2024.3391322","DOIUrl":null,"url":null,"abstract":"In this paper, we explore the problem of minimizing transmission cost among cooperative nodes by jointly optimizing caching and routing in a hybrid network with vital support of service differentiation. We show that the optimal routing policy is a \n<italic>route-to-least cost-cache</i>\n (RLC) policy for fixed caching policy. We formulate the cooperative caching problem as a multi-agent Markov decision process (MDP) with the goal of maximizing the long-term expected caching reward, which is NP-complete even when assuming users’ demand is perfectly known. To solve this problem, we propose C-MAAC, a partially decentralized multi-agent deep reinforcement learning (MADRL)-based collaborative caching algorithm employing actor-critic learning model. C-MAAC has a key characteristic of centralized training and decentralized execution, with which the challenge from unstable training process caused by simultaneous decision made by all agents can be addressed. Furthermore, we develop an optimization method as a criterion for our MADRL framework when assuming the content popularity is stationary and prior known. Our experimental results demonstrate that compared with the prior art, C-MAAC increases an average of 21.7% caching reward in dynamic environment when user request traffic changes rapidly.","PeriodicalId":13069,"journal":{"name":"IEEE Transactions on Cognitive Communications and Networking","volume":"10 5","pages":"1959-1974"},"PeriodicalIF":7.4000,"publicationDate":"2024-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Multi-Agent Reinforcement Learning-Based Joint Caching and Routing in Heterogeneous Networks\",\"authors\":\"Meiyi Yang;Deyun Gao;Chuan Heng Foh;Wei Quan;Victor C. M. 
Leung\",\"doi\":\"10.1109/TCCN.2024.3391322\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In this paper, we explore the problem of minimizing transmission cost among cooperative nodes by jointly optimizing caching and routing in a hybrid network with vital support of service differentiation. We show that the optimal routing policy is a \\n<italic>route-to-least cost-cache</i>\\n (RLC) policy for fixed caching policy. We formulate the cooperative caching problem as a multi-agent Markov decision process (MDP) with the goal of maximizing the long-term expected caching reward, which is NP-complete even when assuming users’ demand is perfectly known. To solve this problem, we propose C-MAAC, a partially decentralized multi-agent deep reinforcement learning (MADRL)-based collaborative caching algorithm employing actor-critic learning model. C-MAAC has a key characteristic of centralized training and decentralized execution, with which the challenge from unstable training process caused by simultaneous decision made by all agents can be addressed. Furthermore, we develop an optimization method as a criterion for our MADRL framework when assuming the content popularity is stationary and prior known. 
Our experimental results demonstrate that compared with the prior art, C-MAAC increases an average of 21.7% caching reward in dynamic environment when user request traffic changes rapidly.\",\"PeriodicalId\":13069,\"journal\":{\"name\":\"IEEE Transactions on Cognitive Communications and Networking\",\"volume\":\"10 5\",\"pages\":\"1959-1974\"},\"PeriodicalIF\":7.4000,\"publicationDate\":\"2024-04-19\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Cognitive Communications and Networking\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10505879/\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"TELECOMMUNICATIONS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Cognitive Communications and Networking","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10505879/","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"TELECOMMUNICATIONS","Score":null,"Total":0}
Multi-Agent Reinforcement Learning-Based Joint Caching and Routing in Heterogeneous Networks
In this paper, we explore the problem of minimizing the transmission cost among cooperative nodes by jointly optimizing caching and routing in a hybrid network with support for service differentiation. We show that, for a fixed caching policy, the optimal routing policy is a route-to-least-cost-cache (RLC) policy. We formulate the cooperative caching problem as a multi-agent Markov decision process (MDP) with the goal of maximizing the long-term expected caching reward; this problem is NP-complete even when users' demand is perfectly known. To solve it, we propose C-MAAC, a partially decentralized collaborative caching algorithm based on multi-agent deep reinforcement learning (MADRL) that employs an actor-critic learning model. A key characteristic of C-MAAC is centralized training with decentralized execution, which addresses the training instability caused by all agents making decisions simultaneously. Furthermore, assuming that content popularity is stationary and known a priori, we develop an optimization method that serves as a criterion for our MADRL framework. Our experimental results demonstrate that, compared with the prior art, C-MAAC increases the caching reward by an average of 21.7% in dynamic environments where user request traffic changes rapidly.
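The centralized-training/decentralized-execution idea behind algorithms of this kind can be illustrated with a toy sketch. The code below is not the paper's C-MAAC (which uses deep actor-critic networks); it is a minimal, hypothetical instance of the same pattern: each caching node holds its own softmax policy over which content to cache (decentralized execution), while training uses a shared reward signal and a shared baseline that depend on the joint placement (centralized training). The environment, popularity values, and all names are illustrative assumptions.

```python
import math
import random

random.seed(0)

N_AGENTS = 2        # cooperating caching nodes (toy setting, not from the paper)
N_CONTENTS = 4      # catalogue size (assumed)
POPULARITY = [0.5, 0.3, 0.15, 0.05]  # stationary request distribution (assumed)

# Decentralized actors: one softmax policy (a logit vector) per agent.
logits = [[0.0] * N_CONTENTS for _ in range(N_AGENTS)]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def sample(probs):
    r, acc = random.random(), 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            return i
    return len(probs) - 1

def joint_reward(actions):
    # A request is a hit if any cooperating node stores the content.
    # Duplicated placements earn nothing extra, so agents must coordinate.
    cached = set(actions)
    return sum(p for c, p in enumerate(POPULARITY) if c in cached)

# Centralized training: the reward and running baseline are computed from the
# JOINT action, stabilising each agent's otherwise independent REINFORCE update.
baseline, lr, beta = 0.0, 0.5, 0.05
for episode in range(3000):
    probs = [softmax(l) for l in logits]
    actions = [sample(p) for p in probs]    # decentralized: each agent acts alone
    r = joint_reward(actions)               # centralized: shared team reward
    advantage = r - baseline
    baseline += beta * (r - baseline)
    for a in range(N_AGENTS):
        for c in range(N_CONTENTS):
            grad = (1.0 if c == actions[a] else 0.0) - probs[a][c]
            logits[a][c] += lr * advantage * grad

# After training, each agent executes greedily using only its own policy.
greedy = [max(range(N_CONTENTS), key=lambda l: l_[l] if False else 0) for l_ in []] or \
         [max(range(N_CONTENTS), key=lambda c, l=l: l[c]) for l in logits]
print("greedy placements:", greedy, "reward:", round(joint_reward(greedy), 3))
```

Under the assumed popularity vector, the shared reward discourages both agents from caching the same item, so training typically drives them toward complementary placements, which is the cooperative behaviour the abstract attributes to joint caching.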
Journal Introduction:
The IEEE Transactions on Cognitive Communications and Networking (TCCN) aims to publish high-quality manuscripts that push the boundaries of cognitive communications and networking research. Cognitive, in this context, refers to the application of perception, learning, reasoning, memory, and adaptive approaches in communication system design. The Transactions welcomes submissions that explore various aspects of cognitive communications and networks, focusing on innovative and holistic approaches to complex system design. Key topics covered include architecture, protocols, cross-layer design, and cognition cycle design for cognitive networks. Additionally, research on machine learning, artificial intelligence, end-to-end and distributed intelligence, software-defined networking, cognitive radios, spectrum sharing, and security and privacy issues in cognitive networks is of interest. The publication also encourages papers addressing novel services and applications enabled by these cognitive concepts.