Localizing state space for visual reinforcement learning in noisy environments

Impact Factor 7.5 · CAS Zone 2 (Computer Science) · JCR Q1 (Automation & Control Systems)
Jing Cheng, Jingchen Li, Haobin Shi, Tao Zhang
{"title":"Localizing state space for visual reinforcement learning in noisy environments","authors":"Jing Cheng ,&nbsp;Jingchen Li ,&nbsp;Haobin Shi ,&nbsp;Tao Zhang","doi":"10.1016/j.engappai.2025.110998","DOIUrl":null,"url":null,"abstract":"<div><div>Gaining robust policies is what the visual reinforcement learning community desires. In practical application, the noises in an environment lead to a larger variance in the perception of a reinforcement learning agent. This work introduces a non-differential module into deep reinforcement learning to localize the state space for agents, by which the impact of noises can be greatly reduced, and the learned policy can be explained implicitly. The proposed model leverages a hard attention module for localization, while an additional reinforcement learning process is built to update the localization module. We analyze the relationship between the non-differential module and agent, regarding the whole training as a hierarchical multi-agent reinforcement learning model, ensuring the convergence of policies by centralized evaluation. Moreover, to couple the localization policy and behavior policy, we modify the evaluation processes, gaining more direct coordination for them. The proposed method enables the agent to localize its observation or state in an explainable way, learning more advanced and robust policies by ignoring irrelevant data or changes in noisy environments. That is, it enhances reinforcement learning’s ability to disturbance rejection. Several experiments on simulation environments and Robot Arm suggest our localization module can be embedded into existing reinforcement learning models to enhance them in many respects.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"156 ","pages":"Article 110998"},"PeriodicalIF":7.5000,"publicationDate":"2025-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Engineering Applications of Artificial Intelligence","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0952197625009984","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"AUTOMATION & CONTROL SYSTEMS","Score":null,"Total":0}
引用次数: 0

Abstract

Robust policies are what the visual reinforcement learning community seeks. In practical applications, noise in the environment increases the variance of a reinforcement learning agent's perception. This work introduces a non-differentiable module into deep reinforcement learning that localizes the state space for agents, greatly reducing the impact of noise and making the learned policy implicitly explainable. The proposed model leverages a hard attention module for localization, and an additional reinforcement learning process is built to update this localization module. We analyze the relationship between the non-differentiable module and the agent, treating the whole training procedure as a hierarchical multi-agent reinforcement learning model and ensuring policy convergence through centralized evaluation. Moreover, to couple the localization policy with the behavior policy, we modify the evaluation processes so that the two are coordinated more directly. The proposed method enables the agent to localize its observation or state in an explainable way, learning more advanced and robust policies by ignoring irrelevant data or changes in noisy environments; that is, it improves reinforcement learning's disturbance rejection. Several experiments in simulated environments and on a robot arm suggest that our localization module can be embedded into existing reinforcement learning models to enhance them in many respects.
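The abstract describes the mechanism only at a high level, so the following minimal sketch illustrates the core idea under stated assumptions: a hard-attention localizer selects one crop window over the noisy observation (a non-differentiable choice), the behavior policy sees only that crop, and the localizer is therefore trained by its own reinforcement learning update, rewarded with the same return the behavior policy earns (centralized evaluation). All class and variable names, the window grid, and the bandit-style update are illustrative assumptions, not the paper's implementation.

# Hypothetical sketch: hard-attention localization over a noisy observation.
# The crop is non-differentiable, so the localizer is trained as a second
# RL process instead of by backpropagation. Names are illustrative only.
import numpy as np

class HardAttentionLocalizer:
    """Selects one of a grid of crop windows over the observation (hard attention)."""
    def __init__(self, obs_size=84, crop=32, n_windows=9, lr=0.1, eps=0.1):
        self.n_windows = n_windows
        self.eps = eps
        self.lr = lr
        self.crop = crop
        self.q = np.zeros(n_windows)          # estimated value of each window
        side = int(np.sqrt(n_windows))        # 3x3 grid of candidate windows
        step = (obs_size - crop) // (side - 1)
        self.offsets = [(r * step, c * step) for r in range(side) for c in range(side)]

    def select(self, rng):
        # Epsilon-greedy choice of window; this discrete pick is the
        # non-differentiable step that motivates a separate RL update.
        if rng.random() < self.eps:
            return int(rng.integers(self.n_windows))
        return int(np.argmax(self.q))

    def extract(self, obs, w):
        # Localized state: only this crop is passed to the behavior policy.
        r, c = self.offsets[w]
        return obs[r:r + self.crop, c:c + self.crop]

    def update(self, w, ret):
        # Centralized evaluation: the localizer is rewarded with the same
        # return the behavior policy obtains from the localized state.
        self.q[w] += self.lr * (ret - self.q[w])

rng = np.random.default_rng(0)
loc = HardAttentionLocalizer()
obs = rng.normal(size=(84, 84))               # stand-in for a noisy visual observation
w = loc.select(rng)
patch = loc.extract(obs, w)                   # localized state fed to the behavior policy
episode_return = 1.0                          # placeholder: return from the behavior policy
loc.update(w, episode_return)

In the paper, both the localizer and the behavior policy are deep networks trained jointly as a hierarchical multi-agent system; the tabular update above is only meant to show why the non-differentiable crop must be trained by its own reinforcement learning process rather than by gradient flow from the behavior policy.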
Source journal

Engineering Applications of Artificial Intelligence (Engineering: Electrical & Electronic)
CiteScore: 9.60
Self-citation rate: 10.00%
Articles per year: 505
Review time: 68 days
Journal description: Artificial Intelligence (AI) is pivotal in driving the fourth industrial revolution, witnessing remarkable advancements across various machine learning methodologies. AI techniques have become indispensable tools for practicing engineers, enabling them to tackle previously insurmountable challenges. Engineering Applications of Artificial Intelligence serves as a global platform for the swift dissemination of research elucidating the practical application of AI methods across all engineering disciplines. Submitted papers are expected to present novel aspects of AI utilized in real-world engineering applications, validated using publicly available datasets to ensure the replicability of research outcomes.