Tao Mao;Junlong Zhu;Mingchuan Zhang;Quanbo Ge;Ruijuan Zheng;Qingtao Wu
{"title":"一种分散的熵正则化actor - critical算法及其有限时间分析","authors":"Tao Mao;Junlong Zhu;Mingchuan Zhang;Quanbo Ge;Ruijuan Zheng;Qingtao Wu","doi":"10.1109/TNNLS.2025.3573801","DOIUrl":null,"url":null,"abstract":"Decentralized actor-critic (AC) is one of the most dominant algorithms for dealing with multiagent reinforcement learning (MARL) problems. However, <italic>exploration-efficient</i>, <italic>sample-efficient</i>, and <italic>communication-efficient</i> are difficult to achieve simultaneously by existing decentralized AC methods. For this reason, this article develops a decentralized multiagent AC algorithm by incorporating entropy regularization to improve exploration with theoretical guarantees, referred to as multi-agent AC algorithm with entropy regularization (MACE). Moreover, we rigorously prove that MACE can achieve sample complexity <inline-formula> <tex-math>$\\mathcal {O}(\\epsilon ^{-2}\\ln \\epsilon ^{-1})$ </tex-math></inline-formula> and communication complexity of <inline-formula> <tex-math>$\\mathcal {O}(\\epsilon ^{-1}\\ln \\epsilon ^{-1})$ </tex-math></inline-formula>, which match the best complexities at present. Finally, the performance of MACE is also evaluated on reinforcement learning (RL) tasks. The experimental results show that the proposed algorithm achieves better exploration efficiency than state-of-the-art decentralized AC-type algorithms.","PeriodicalId":13303,"journal":{"name":"IEEE transactions on neural networks and learning systems","volume":"36 10","pages":"19423-19436"},"PeriodicalIF":8.9000,"publicationDate":"2025-06-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"A Decentralized Actor–Critic Algorithm With Entropy Regularization and Its Finite-Time Analysis\",\"authors\":\"Tao Mao;Junlong Zhu;Mingchuan Zhang;Quanbo Ge;Ruijuan Zheng;Qingtao Wu\",\"doi\":\"10.1109/TNNLS.2025.3573801\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Decentralized actor-critic (AC) is one of the most dominant algorithms for dealing with multiagent reinforcement learning (MARL) problems. However, <italic>exploration-efficient</i>, <italic>sample-efficient</i>, and <italic>communication-efficient</i> are difficult to achieve simultaneously by existing decentralized AC methods. For this reason, this article develops a decentralized multiagent AC algorithm by incorporating entropy regularization to improve exploration with theoretical guarantees, referred to as multi-agent AC algorithm with entropy regularization (MACE). Moreover, we rigorously prove that MACE can achieve sample complexity <inline-formula> <tex-math>$\\\\mathcal {O}(\\\\epsilon ^{-2}\\\\ln \\\\epsilon ^{-1})$ </tex-math></inline-formula> and communication complexity of <inline-formula> <tex-math>$\\\\mathcal {O}(\\\\epsilon ^{-1}\\\\ln \\\\epsilon ^{-1})$ </tex-math></inline-formula>, which match the best complexities at present. Finally, the performance of MACE is also evaluated on reinforcement learning (RL) tasks. 
The experimental results show that the proposed algorithm achieves better exploration efficiency than state-of-the-art decentralized AC-type algorithms.\",\"PeriodicalId\":13303,\"journal\":{\"name\":\"IEEE transactions on neural networks and learning systems\",\"volume\":\"36 10\",\"pages\":\"19423-19436\"},\"PeriodicalIF\":8.9000,\"publicationDate\":\"2025-06-10\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE transactions on neural networks and learning systems\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/11029594/\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on neural networks and learning systems","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/11029594/","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
A Decentralized Actor–Critic Algorithm With Entropy Regularization and Its Finite-Time Analysis
Decentralized actor-critic (AC) is one of the dominant approaches for dealing with multiagent reinforcement learning (MARL) problems. However, exploration efficiency, sample efficiency, and communication efficiency are difficult to achieve simultaneously with existing decentralized AC methods. For this reason, this article develops a decentralized multiagent AC algorithm, referred to as the multiagent AC algorithm with entropy regularization (MACE), which incorporates entropy regularization to improve exploration with theoretical guarantees. Moreover, we rigorously prove that MACE achieves a sample complexity of $\mathcal {O}(\epsilon ^{-2}\ln \epsilon ^{-1})$ and a communication complexity of $\mathcal {O}(\epsilon ^{-1}\ln \epsilon ^{-1})$, matching the best known complexities to date. Finally, MACE is evaluated on reinforcement learning (RL) tasks. The experimental results show that the proposed algorithm achieves better exploration efficiency than state-of-the-art decentralized AC-type algorithms.
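The abstract attributes MACE's improved exploration to entropy regularization. As a rough single-agent illustration of that mechanism only (not the paper's decentralized MACE algorithm, whose updates, communication scheme, and analysis are given in the full text), the sketch below shows how an entropy bonus typically enters a tabular actor-critic update; the toy MDP, the hyperparameters, and all variable names here are assumptions for illustration.

```python
# Minimal single-agent sketch of entropy-regularized actor-critic on a toy
# tabular MDP. This is NOT the paper's decentralized MACE algorithm; it only
# illustrates how an entropy bonus enters the critic target and actor gradient.
import numpy as np

rng = np.random.default_rng(0)
S, A = 4, 2                     # toy numbers of states and actions (assumption)
gamma, tau = 0.9, 0.05          # discount factor, entropy-regularization weight
alpha_v, alpha_pi = 0.1, 0.05   # critic and actor step sizes

# Toy MDP (assumption): random transition kernel and rewards.
P = rng.dirichlet(np.ones(S), size=(S, A))  # P[s, a] is a distribution over next states
R = rng.uniform(0.0, 1.0, size=(S, A))      # R[s, a] is the immediate reward

theta = np.zeros((S, A))  # actor: softmax policy logits
V = np.zeros(S)           # critic: tabular state-value estimates

def policy(s):
    """Softmax policy over actions in state s."""
    z = np.exp(theta[s] - theta[s].max())
    return z / z.sum()

s = 0
for _ in range(5000):
    pi = policy(s)
    a = rng.choice(A, p=pi)
    s_next = rng.choice(S, p=P[s, a])

    # Entropy bonus: the regularized reward is r + tau * H(pi(.|s)),
    # which rewards stochastic (exploratory) policies.
    H = -np.sum(pi * np.log(pi + 1e-12))
    td_error = R[s, a] + tau * H + gamma * V[s_next] - V[s]

    # Critic step: TD(0) update toward the entropy-regularized target.
    V[s] += alpha_v * td_error

    # Actor step: policy gradient plus the gradient of the entropy bonus.
    grad_logpi = -pi.copy()
    grad_logpi[a] += 1.0                     # d log pi(a|s) / d theta[s, :]
    grad_H = -pi * (np.log(pi + 1e-12) + H)  # d H(pi(.|s)) / d theta[s, :]
    theta[s] += alpha_pi * (td_error * grad_logpi + tau * grad_H)

    s = s_next

print("learned policy per state:\n", np.array([policy(s) for s in range(S)]))
```

Note the two places the weight tau appears: in the temporal-difference target (so the critic values entropy-augmented returns) and in the actor gradient (so the policy is directly pushed toward higher entropy). Driving tau to zero recovers the standard unregularized AC update.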
Journal introduction:
The focus of IEEE Transactions on Neural Networks and Learning Systems is to present scholarly articles discussing the theory, design, and applications of neural networks as well as other learning systems. The journal primarily highlights technical and scientific research in this domain.