Title: Self-Attention-Based Temporary Curiosity in Reinforcement Learning Exploration
Authors: Hangkai Hu, Shiji Song, Gao Huang
DOI: 10.1109/TSMC.2019.2957051 (https://doi.org/10.1109/TSMC.2019.2957051)
Journal: IEEE Transactions on Systems Man and Cybernetics Part A-Systems and Humans
Volume: 41 1
Pages: 5773-5784
Publication date: 2021-09-01
Publication type: Journal Article
Citations: 5
Abstract
In many real-world scenarios, the extrinsic rewards provided by the environment are sparse. An agent trained with a classic reinforcement learning algorithm fails to explore such environments sufficiently and effectively. To address this problem, an exploration bonus derived from environmental novelty serves as intrinsic motivation for the agent. In recent years, curiosity-driven exploration has become a mainstream approach that describes environmental novelty through the prediction errors of dynamics models. Because curiosity-based measures of environmental novelty have limited expressive power and an appropriate feature space is difficult to find, most curiosity-driven exploration methods suffer from overprotection against repetition. This problem can reduce exploration efficiency and trap the agent in a local optimum. In this article, we propose a framework that combines persisting curiosity and temporary curiosity to deal with overprotection against repetition. We introduce the self-attention mechanism from the field of computer vision and propose a sequence-based self-attention mechanism for generating temporary curiosity. We compare our framework with several previous exploration methods in hard-exploration environments, provide a comprehensive analysis of the proposed framework, and investigate the effect of the individual components of our method. The experimental results indicate that the proposed framework outperforms existing methods.
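The abstract's notion of curiosity as the prediction error of a dynamics model can be illustrated with a toy sketch. This is not the authors' framework: the linear model, the class name `LinearForwardModel`, the helper `bonuses_for_repeated_transition`, the learning rate, and the example transition are all hypothetical choices made for illustration. It only shows that a prediction-error bonus shrinks as the same transition is revisited.

```python
from typing import List, Sequence


class LinearForwardModel:
    """Toy dynamics model (an assumption, not the paper's model).

    Predicts the next state as W @ [state; action] and is trained online
    with plain gradient descent on the squared prediction error.
    """

    def __init__(self, state_dim: int, action_dim: int, lr: float = 0.1) -> None:
        self.lr = lr
        in_dim = state_dim + action_dim
        # Weight matrix initialised to zero: one row per predicted state dimension.
        self.w = [[0.0] * in_dim for _ in range(state_dim)]

    def predict(self, state: Sequence[float], action: Sequence[float]) -> List[float]:
        x = list(state) + list(action)
        return [sum(wij * xj for wij, xj in zip(row, x)) for row in self.w]

    def curiosity(self, state, action, next_state) -> float:
        """Intrinsic bonus: half the squared error of the predicted next state."""
        pred = self.predict(state, action)
        return 0.5 * sum((p - t) ** 2 for p, t in zip(pred, next_state))

    def update(self, state, action, next_state) -> None:
        """One gradient step on the squared prediction error."""
        x = list(state) + list(action)
        pred = self.predict(state, action)
        for i, row in enumerate(self.w):
            err = pred[i] - next_state[i]
            for j in range(len(row)):
                row[j] -= self.lr * err * x[j]


def bonuses_for_repeated_transition(n_visits: int) -> List[float]:
    """Return the bonus received on each of n_visits to one fixed transition."""
    model = LinearForwardModel(state_dim=2, action_dim=1)
    # Hypothetical transition (state, action, next_state), chosen arbitrarily.
    s, a, s_next = [1.0, 0.0], [1.0], [0.0, 1.0]
    bonuses = []
    for _ in range(n_visits):
        bonuses.append(model.curiosity(s, a, s_next))
        model.update(s, a, s_next)  # the model learns, so the bonus shrinks
    return bonuses


if __name__ == "__main__":
    for visit, bonus in enumerate(bonuses_for_repeated_transition(5), start=1):
        print(f"visit {visit}: bonus = {bonus:.4f}")
```

In this sketch the bonus for a revisited transition decays towards zero as the model fits it, so an agent driven only by such a bonus is steadily pushed away from transitions it has already seen.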
About the journal:
The scope of the IEEE Transactions on Systems, Man, and Cybernetics: Systems includes the fields of systems engineering. It includes issue formulation, analysis and modeling, decision making, and issue interpretation for any of the systems engineering lifecycle phases associated with the definition, development, and deployment of large systems. In addition, it includes systems management, systems engineering processes, and a variety of systems engineering methods such as optimization, modeling and simulation.