On Learning Generalized Wireless MAC Communication Protocols via a Feasible Multi-Agent Reinforcement Learning Framework

IEEE Transactions on Machine Learning in Communications and Networking. Pub Date: 2024-02-20. DOI: 10.1109/TMLCN.2024.3368367

Luciano Miuccio;Salvatore Riolo;Sumudu Samarakoon;Mehdi Bennis;Daniela Panno

Volume 2, pages 298-317. Full text: https://ieeexplore.ieee.org/document/10440615/

Citations: 0

Abstract

Automatically learning medium access control (MAC) communication protocols via multi-agent reinforcement learning (MARL) has received considerable attention as a way to cater to the extremely diverse real-world scenarios expected in 6G wireless networks. Several state-of-the-art solutions adopt the centralized training with decentralized execution (CTDE) learning method, in which agents learn optimal MAC protocols by exploiting the information exchanged with a central unit. Despite the promising results of these works, they neglect two notable challenges. First, they were designed to be trained in computer simulations that assume an omniscient environment and ignore communication overhead, which makes them impractical to implement in real-world scenarios. Second, the learned protocols fail to generalize outside the scenario they were trained on. In this paper, we propose a new feasible learning framework that enables practical implementations of training procedures, thus allowing learned MAC protocols to be tailor-made for the scenario where they will be executed. Moreover, to address the second challenge, we leverage the concept of state abstraction and integrate it into the MARL framework for better generalization. As a result, the policies are learned in an abstracted observation space that contains only the useful information extracted from the original high-dimensional and redundant observation space. Simulation results show that our feasible learning framework achieves performance comparable to that of the infeasible solutions. In addition, the learning frameworks that adopt observation abstraction generalize better with respect to the number of user equipments (UEs), the number of data packets to transmit, and channel conditions.
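
The abstract does not say how the abstracted observation space is constructed. The following is a purely illustrative sketch of the general idea of observation abstraction, not the authors' method. It assumes a hypothetical per-UE observation made of a buffer occupancy and a history of channel feedback symbols, and a hand-written mapping that keeps only whether the buffer holds data and the most recent feedback. All names (`RawObservation`, `abstract_observation`, the feedback labels) are assumptions introduced for this example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class RawObservation:
    """Hypothetical per-UE observation; the fields are illustrative assumptions."""
    buffer_packets: int                 # packets waiting in the UE buffer
    feedback_history: Tuple[str, ...]   # past channel feedback, newest last


def abstract_observation(obs: RawObservation) -> Tuple[bool, str]:
    """Map a raw observation to a compact abstract one.

    Keeps only (a) whether the buffer is non-empty and (b) the most recent
    feedback symbol, discarding the exact packet count and older history.
    """
    has_data = obs.buffer_packets > 0
    last_feedback = obs.feedback_history[-1] if obs.feedback_history else "none"
    return has_data, last_feedback


# Two raw observations that differ in packet count and history length.
small = RawObservation(buffer_packets=2, feedback_history=("idle", "ack"))
large = RawObservation(buffer_packets=50,
                       feedback_history=("collision",) * 9 + ("ack",))

# Both collapse to the same abstract observation, so a policy defined on the
# abstract space treats them identically.
assert abstract_observation(small) == abstract_observation(large) == (True, "ack")
```

Under these assumptions, a policy that takes the abstract tuple as input cannot depend on the exact number of buffered packets, which is one intuitive reason an abstraction of this kind could help a learned protocol transfer to traffic loads not seen during training.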
