Age of Semantics in Cooperative Communications: To Expedite Simulation Towards Real via Offline Reinforcement Learning

IF 6.7; Zone 2 (Computer Science); JCR Q1 (Engineering, Multidisciplinary)
Xianfu Chen, Zhifeng Zhao, Shiwen Mao, Celimuge Wu, Honggang Zhang, Mehdi Bennis
IEEE Transactions on Network Science and Engineering, vol. 12, no. 4, pp. 3244-3258, published 2025-04-08
DOI: 10.1109/TNSE.2025.3558747
URL: https://ieeexplore.ieee.org/document/10955408/
Citations: 0

Abstract

The age of information metric fails to capture the intrinsic semantics of a status update. For an intelligent reflecting surface-aided cooperative relay communication system, we propose the age of semantics (AoS) to measure the semantic freshness of status updates. Specifically, we focus on the delivery of status updates from a source node (SN) to the destination, which we formulate as a Markov decision process. The objective of the SN is to maximize the expected satisfaction of AoS and energy consumption under the maximum transmit power constraint. To seek the optimal control policy, we first derive an online deep actor-critic (DAC) learning scheme under the on-policy temporal difference learning framework. However, implementing the online DAC in practice poses a key challenge: it requires infinitely repeated interactions between the SN and the system, which can be dangerous, particularly during exploration. We then put forward a novel offline DAC scheme, which estimates the optimal control policy from a previously collected dataset without any further interactions with the system. Numerical experiments verify the theoretical results and show that our offline DAC scheme significantly outperforms the online DAC scheme and the most representative baselines in terms of mean utility, demonstrating strong robustness to dataset quality.
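The on-policy temporal difference idea behind an actor-critic scheme can be illustrated with a minimal sketch. This is not the authors' DAC scheme: it swaps the deep networks for small tables, and every name and parameter here (`td_error`, `actor_critic_step`, the step sizes `alpha_v` and `alpha_pi`, the discount `gamma`) is an assumption made for illustration. The critic moves its value estimate toward the one-step TD target, and the actor shifts its softmax action preferences in the direction of the TD error.

```python
import math


def td_error(reward, value_s, value_next, gamma):
    """One-step TD error: r + gamma * V(s') - V(s)."""
    return reward + gamma * value_next - value_s


def softmax(prefs):
    """Numerically stable softmax over a list of action preferences."""
    top = max(prefs)
    exps = [math.exp(p - top) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def actor_critic_step(values, prefs, s, a, r, s_next,
                      gamma=0.9, alpha_v=0.5, alpha_pi=0.1):
    """Apply one tabular actor-critic update in place (hypothetical helper).

    values: list of state-value estimates V(s), the critic.
    prefs:  per-state lists of action preferences, the actor.
    """
    delta = td_error(r, values[s], values[s_next], gamma)
    # Action probabilities under the current policy, before any update.
    probs = softmax(prefs[s])
    # Critic: move V(s) toward the TD target.
    values[s] += alpha_v * delta
    # Actor: gradient of log softmax w.r.t. preference b is 1[b == a] - pi(b|s).
    for b in range(len(prefs[s])):
        indicator = 1.0 if b == a else 0.0
        prefs[s][b] += alpha_pi * delta * (indicator - probs[b])
    return delta


# Toy usage: two states, two actions, one observed transition.
values = [0.0, 0.0]
prefs = [[0.0, 0.0], [0.0, 0.0]]
delta = actor_critic_step(values, prefs, s=0, a=1, r=1.0, s_next=1)
```

In the online setting each transition `(s, a, r, s_next)` would come from a fresh interaction between the SN and the system, which is the cost the offline scheme is meant to avoid by learning from a previously collected dataset instead.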
Source journal
IEEE Transactions on Network Science and Engineering (Engineering: Control and Systems Engineering)
CiteScore: 12.60
Self-citation rate: 9.10%
Articles published: 393
About the journal: The IEEE Transactions on Network Science and Engineering (TNSE) is committed to the timely publication of peer-reviewed technical articles that deal with the theory and applications of network science and the interconnections among the elements in a system that form a network. In particular, it publishes articles on understanding, prediction, and control of the structures and behaviors of networks at the fundamental level, with the aim of discovering common principles that govern network structures, functionalities, and behaviors. The types of networks covered include physical or engineered networks, information networks, biological networks, semantic networks, economic networks, social networks, and ecological networks. Another trans-disciplinary focus of the journal is the interactions between and co-evolution of different genres of networks.