Causally-Aware Reinforcement Learning for Joint Communication and Sensing

IEEE Transactions on Machine Learning in Communications and Networking, vol. 3, pp. 552-567. Pub Date: 2025-04-21. DOI: 10.1109/TMLCN.2025.3562557

Anik Roy; Serene Banerjee; Jishnu Sadasivan; Arnab Sarkar; Soumyajit Dey



Abstract



Next-generation wireless networks, 6G and beyond, are envisioned to integrate communication and sensing in order to overcome interference, improve spectrum efficiency, and reduce hardware cost and power consumption. Massive Multiple-Input Multiple-Output (mMIMO)-based Joint Communication and Sensing (JCAS) systems realize this integration for 6G applications such as autonomous driving, which requires accurate environmental sensing and time-critical communication with neighbouring vehicles. Reinforcement Learning (RL) has been used for mMIMO antenna beamforming in the existing literature. However, the huge action search space associated with antenna beamforming makes learning inefficient for the RL agent because of the high beam training overhead. The learning process also ignores the causal relationship between the action space and the reward, and gives all actions equal importance. In this work, we explore a causally-aware RL agent that can intervene and discover causal relationships in mMIMO-based JCAS environments during the training phase. We use a state-dependent action dimension selection strategy to realize causal discovery for RL-based JCAS. Evaluation of the causally-aware RL framework in different JCAS scenarios shows that the proposed solution achieves higher reward than baseline methods. In the presence of interfering users and sensing signal clutter, the proposed solution achieves a 30% higher data rate than the communication-only state-of-the-art beam pattern learning method while maintaining sensing performance.
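The abstract does not spell out how the state-dependent action dimension selection works, so the following is only a toy sketch of the general idea, not the authors' algorithm. It assumes a binary action vector and a hypothetical reward function `toy_reward` in which only a state-dependent subset of action dimensions affects the reward. The helper `causal_dims` (also hypothetical) intervenes on one dimension at a time, compares the reward under do(a_d = 1) and do(a_d = 0), and keeps the dimensions whose average effect exceeds a threshold.

```python
import random


def toy_reward(state, action):
    """Hypothetical reward: only a state-dependent subset of dimensions matters."""
    relevant = (0, 1) if state == 0 else (2, 3)
    return sum(action[d] for d in relevant)


def causal_dims(state, n_dims, reward_fn, n_samples=200, threshold=0.1, rng=None):
    """Estimate which action dimensions causally influence the reward in `state`.

    For each dimension d, the remaining dimensions are sampled at random and
    the reward is compared under the interventions do(a_d = 1) and do(a_d = 0).
    Dimensions whose mean absolute effect exceeds `threshold` are selected.
    """
    rng = rng or random.Random(0)
    selected = []
    for d in range(n_dims):
        effect = 0.0
        for _ in range(n_samples):
            base = [rng.randint(0, 1) for _ in range(n_dims)]
            forced_on = base.copy()
            forced_on[d] = 1
            forced_off = base.copy()
            forced_off[d] = 0
            effect += reward_fn(state, forced_on) - reward_fn(state, forced_off)
        if abs(effect / n_samples) > threshold:
            selected.append(d)
    return selected


if __name__ == "__main__":
    print(causal_dims(0, 6, toy_reward))  # [0, 1]
    print(causal_dims(1, 6, toy_reward))  # [2, 3]
```

In this toy setting an agent would then explore only the selected dimensions in each state, which shrinks the effective search space from 2^6 to 2^2 actions. A real beamforming action space and reward would be far larger and noisier than this illustration.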
