Reinforcement Learning Based Online Algorithm for Near-Field Time-Varying IRS Phase Shift Optimization: System Evolution Perspective

IF 4.6, Zone 2 (Engineering & Technology), Q1 ENGINEERING, ELECTRICAL & ELECTRONIC

IEEE Transactions on Signal Processing, Pub Date: 2025-02-24, DOI: 10.1109/TSP.2025.3545164

Zongtai Li, Rui Wang, Erwu Liu

IEEE Transactions on Signal Processing, vol. 73, pp. 1231-1245. https://ieeexplore.ieee.org/document/10902006/

Citations: 0

Abstract

This paper proposes a reinforcement learning (RL) based incremental control algorithm for an intelligent reflecting surface (IRS) in a mmWave time-varying multi-user multiple-input single-output (MU-MISO) system. The work addresses a key challenge of near-field IRS design: channels that vary over time because of user mobility. In practice, the optimization becomes harder still when the components of the concatenated channel are unknown. From a higher perspective, we leverage electromagnetic information theory and manifold theory to provide a unified description of the IRS-assisted MU-MISO system. We regard the communication system as a nonlinear dynamic system on a reproducing kernel Hilbert space (RKHS), upon which the approximate evolution operator is defined as observables for system evolution. The IRS phase shift optimization problem is modeled as a nonlinear system eigenvalue maximization problem. Using the geometric properties of the unitary evolution operator, we define a metric space in which the geodesic-based distance function satisfies the Lipschitz condition, enabling efficient exploitation of channel similarities. We thereby transform the complex non-convex optimization problem into a low-dimensional linear contextual bandit problem. The proposed GLinUCB algorithm is evaluated through numerical simulations in various scenarios, which show that it achieves high sum rates with fast convergence.
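To make the "linear contextual bandit" formulation in the abstract more concrete, the sketch below shows the standard disjoint LinUCB rule on a synthetic problem. It is not the paper's GLinUCB: the abstract does not specify the arms, the context features, or the geodesic metric, so everything here (the class name `LinUCB`, the helper `run_demo`, the parameters `alpha` and `reg`, and the random linear reward model) is an illustrative assumption. In an IRS setting one might think of each arm as a candidate phase-shift increment and the reward as an observed sum rate, but that mapping is hypothetical.

```python
import numpy as np


class LinUCB:
    """Standard disjoint LinUCB: one ridge-regression model per arm.

    Generic textbook sketch, NOT the GLinUCB algorithm from the paper.
    """

    def __init__(self, n_arms, dim, alpha=1.0, reg=1.0):
        self.alpha = alpha  # exploration weight (assumed value)
        # A[a] = reg * I + sum of x x^T, b[a] = sum of reward * x, per arm a
        self.A = np.stack([reg * np.eye(dim) for _ in range(n_arms)])
        self.b = np.zeros((n_arms, dim))

    def select(self, x):
        """Return the arm with the largest upper confidence bound for context x."""
        scores = np.empty(len(self.A))
        for a in range(len(self.A)):
            theta = np.linalg.solve(self.A[a], self.b[a])  # ridge estimate
            a_inv_x = np.linalg.solve(self.A[a], x)
            scores[a] = theta @ x + self.alpha * np.sqrt(x @ a_inv_x)
        return int(np.argmax(scores))

    def update(self, arm, x, reward):
        """Fold the observed (context, reward) pair into the chosen arm's model."""
        self.A[arm] += np.outer(x, x)
        self.b[arm] += reward * x


def run_demo(steps=2000, n_arms=4, dim=3, seed=0):
    """Run LinUCB on a synthetic linear bandit; return the fraction of optimal picks."""
    rng = np.random.default_rng(seed)
    true_theta = rng.normal(size=(n_arms, dim))  # hidden reward models (synthetic)
    agent = LinUCB(n_arms, dim)
    hits = 0
    for _ in range(steps):
        x = rng.normal(size=dim)
        x /= np.linalg.norm(x)  # unit-norm context vector
        arm = agent.select(x)
        means = true_theta @ x
        reward = means[arm] + 0.1 * rng.normal()  # noisy scalar feedback
        agent.update(arm, x, reward)
        hits += int(arm == int(np.argmax(means)))
    return hits / steps


if __name__ == "__main__":
    print(f"fraction of optimal arm choices: {run_demo():.2f}")
```

The confidence bonus shrinks as an arm accumulates observations, so the agent moves from exploring to exploiting without ever needing the underlying model explicitly, which is the property that makes a bandit formulation attractive when the channel components are unknown.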

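Source journal

IEEE Transactions on Signal Processing (Engineering & Technology - Engineering: Electrical & Electronic)

CiteScore: 11.20

Self-citation rate: 9.30%

Articles published: 310

Review time: 3.0 months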

About the journal: The IEEE Transactions on Signal Processing covers novel theory, algorithms, performance analyses and applications of techniques for the processing, understanding, learning, retrieval, mining, and extraction of information from signals. The term “signal” includes, among others, audio, video, speech, image, communication, geophysical, sonar, radar, medical and musical signals. Examples of topics of interest include, but are not limited to, information processing and the theory and application of filtering, coding, transmitting, estimating, detecting, analyzing, recognizing, synthesizing, recording, and reproducing signals.