Localizing state space for visual reinforcement learning in noisy environments

Impact Factor 7.5 · CAS Zone 2 (Computer Science) · JCR Q1 (Automation & Control Systems)
Jing Cheng, Jingchen Li, Haobin Shi, Tao Zhang
{"title":"Localizing state space for visual reinforcement learning in noisy environments","authors":"Jing Cheng ,&nbsp;Jingchen Li ,&nbsp;Haobin Shi ,&nbsp;Tao Zhang","doi":"10.1016/j.engappai.2025.110998","DOIUrl":null,"url":null,"abstract":"<div><div>Gaining robust policies is what the visual reinforcement learning community desires. In practical application, the noises in an environment lead to a larger variance in the perception of a reinforcement learning agent. This work introduces a non-differential module into deep reinforcement learning to localize the state space for agents, by which the impact of noises can be greatly reduced, and the learned policy can be explained implicitly. The proposed model leverages a hard attention module for localization, while an additional reinforcement learning process is built to update the localization module. We analyze the relationship between the non-differential module and agent, regarding the whole training as a hierarchical multi-agent reinforcement learning model, ensuring the convergence of policies by centralized evaluation. Moreover, to couple the localization policy and behavior policy, we modify the evaluation processes, gaining more direct coordination for them. The proposed method enables the agent to localize its observation or state in an explainable way, learning more advanced and robust policies by ignoring irrelevant data or changes in noisy environments. That is, it enhances reinforcement learning’s ability to disturbance rejection. Several experiments on simulation environments and Robot Arm suggest our localization module can be embedded into existing reinforcement learning models to enhance them in many respects.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"156 ","pages":"Article 110998"},"PeriodicalIF":7.5000,"publicationDate":"2025-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Engineering Applications of Artificial Intelligence","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0952197625009984","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"AUTOMATION & CONTROL SYSTEMS","Score":null,"Total":0}
引用次数: 0

Abstract

Robust policies are what the visual reinforcement learning community seeks. In practical applications, noise in the environment increases the variance of a reinforcement learning agent's perception. This work introduces a non-differentiable module into deep reinforcement learning that localizes the state space for agents, greatly reducing the impact of noise and making the learned policy implicitly explainable. The proposed model leverages a hard attention module for localization, and an additional reinforcement learning process is built to update this localization module. We analyze the relationship between the non-differentiable module and the agent, treating the whole training procedure as a hierarchical multi-agent reinforcement learning model and ensuring policy convergence through centralized evaluation. Moreover, to couple the localization policy with the behavior policy, we modify the evaluation processes so that the two are coordinated more directly. The proposed method enables the agent to localize its observation or state in an explainable way, learning more advanced and robust policies by ignoring irrelevant data or changes in noisy environments; that is, it improves reinforcement learning's disturbance rejection. Several experiments in simulated environments and on a robot arm suggest that our localization module can be embedded into existing reinforcement learning models to enhance them in many respects.
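The abstract describes the mechanism only at a high level, so the following minimal sketch illustrates the core idea under stated assumptions: a hard-attention localizer selects one crop window over the noisy observation (a non-differentiable choice), the behavior policy sees only that crop, and the localizer is therefore trained by its own reinforcement learning update, rewarded with the same return the behavior policy earns (centralized evaluation). All class and variable names, the window grid, and the bandit-style update are illustrative assumptions, not the paper's implementation.

# Hypothetical sketch: hard-attention localization over a noisy observation.
# The crop is non-differentiable, so the localizer is trained as a second
# RL process instead of by backpropagation. Names are illustrative only.
import numpy as np

class HardAttentionLocalizer:
    """Selects one of a grid of crop windows over the observation (hard attention)."""
    def __init__(self, obs_size=84, crop=32, n_windows=9, lr=0.1, eps=0.1):
        self.n_windows = n_windows
        self.eps = eps
        self.lr = lr
        self.crop = crop
        self.q = np.zeros(n_windows)          # estimated value of each window
        side = int(np.sqrt(n_windows))        # 3x3 grid of candidate windows
        step = (obs_size - crop) // (side - 1)
        self.offsets = [(r * step, c * step) for r in range(side) for c in range(side)]

    def select(self, rng):
        # Epsilon-greedy choice of window; this discrete pick is the
        # non-differentiable step that motivates a separate RL update.
        if rng.random() < self.eps:
            return int(rng.integers(self.n_windows))
        return int(np.argmax(self.q))

    def extract(self, obs, w):
        # Localized state: only this crop is passed to the behavior policy.
        r, c = self.offsets[w]
        return obs[r:r + self.crop, c:c + self.crop]

    def update(self, w, ret):
        # Centralized evaluation: the localizer is rewarded with the same
        # return the behavior policy obtains from the localized state.
        self.q[w] += self.lr * (ret - self.q[w])

rng = np.random.default_rng(0)
loc = HardAttentionLocalizer()
obs = rng.normal(size=(84, 84))               # stand-in for a noisy visual observation
w = loc.select(rng)
patch = loc.extract(obs, w)                   # localized state fed to the behavior policy
episode_return = 1.0                          # placeholder: return from the behavior policy
loc.update(w, episode_return)

In the paper, both the localizer and the behavior policy are deep networks trained jointly as a hierarchical multi-agent system; the tabular update above is only meant to show why the non-differentiable crop must be trained by its own reinforcement learning process rather than by gradient flow from the behavior policy.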
Source journal

Engineering Applications of Artificial Intelligence (Engineering: Electrical & Electronic)
CiteScore: 9.60
Self-citation rate: 10.00%
Articles per year: 505
Review time: 68 days
Journal description: Artificial Intelligence (AI) is pivotal in driving the fourth industrial revolution, witnessing remarkable advancements across various machine learning methodologies. AI techniques have become indispensable tools for practicing engineers, enabling them to tackle previously insurmountable challenges. Engineering Applications of Artificial Intelligence serves as a global platform for the swift dissemination of research elucidating the practical application of AI methods across all engineering disciplines. Submitted papers are expected to present novel aspects of AI utilized in real-world engineering applications, validated using publicly available datasets to ensure the replicability of research outcomes.