Title: Entropy-based tuning approach for Q-learning in an unstructured environment
Authors: Yu-Jen Chen, Wei-Cheng Jiang
Journal: Robotics and Autonomous Systems, vol. 187, Article 104924
Publication date: 2025-01-31
DOI: 10.1016/j.robot.2025.104924
URL: https://www.sciencedirect.com/science/article/pii/S0921889025000107
Citation count: 0
Abstract
In reinforcement learning applications, balancing exploration and exploitation is a crucial problem during the learning process. This study proposes an entropy-based tuning approach, built on value-difference-based exploration theory, to solve this problem in an unstructured environment. In such an environment, a learning agent can manage its exploration rate in each state instead of using a constant rate for all states. Moreover, some obstacles may block the agent’s path to the destination. Accordingly, the proposed approach enables the agent to adaptively increase its exploration rates in states undergoing transitions; thus, the agent is encouraged to explore in those states. This paper presents simulations of maze environments and the car parking problem to verify the proposed approach. The simulation results demonstrate that the approach enables the agent to adjust its policy quickly to adapt to changing environments.
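The idea of a per-state exploration rate can be illustrated with a minimal tabular sketch. This is not the authors' algorithm: the abstract does not say how entropy enters the tuning rule, so the code below only shows the general value-difference-based exploration pattern (a state's exploration rate rises when its Q-value changes by a large amount) plus a separate entropy helper as one possible measure of how undecided a state is. The action set, the constants `ALPHA`, `GAMMA`, `SIGMA`, `DELTA`, and all function names are assumptions made for illustration.

```python
import math
import random
from collections import defaultdict

ACTIONS = [0, 1, 2, 3]        # hypothetical action set (e.g. four maze moves)
ALPHA, GAMMA = 0.1, 0.95      # assumed learning rate and discount factor
SIGMA = 1.0                   # assumed sensitivity of the exploration update
DELTA = 1.0 / len(ACTIONS)    # assumed mixing weight for the exploration update

Q = defaultdict(lambda: [0.0] * len(ACTIONS))  # tabular Q-values per state
epsilon = defaultdict(lambda: 1.0)             # one exploration rate per state


def boltzmann_entropy(q_values, temperature=1.0):
    """Shannon entropy (in nats) of a softmax distribution over Q-values."""
    m = max(q_values)
    weights = [math.exp((q - m) / temperature) for q in q_values]
    total = sum(weights)
    probs = [w / total for w in weights]
    # Skip zero probabilities (possible after underflow) to avoid log(0).
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def choose_action(state):
    """Epsilon-greedy selection using the exploration rate of this state."""
    if random.random() < epsilon[state]:
        return random.choice(ACTIONS)
    q = Q[state]
    return max(ACTIONS, key=lambda a: q[a])


def update(state, action, reward, next_state, done=False):
    """One Q-learning step followed by a value-difference epsilon update."""
    target = reward if done else reward + GAMMA * max(Q[next_state])
    td_error = target - Q[state][action]
    Q[state][action] += ALPHA * td_error
    # A large change in Q pushes f toward 1; no change gives f = 0.
    x = math.exp(-abs(ALPHA * td_error) / SIGMA)
    f = (1.0 - x) / (1.0 + x)
    epsilon[state] = DELTA * f + (1.0 - DELTA) * epsilon[state]
    return td_error
```

In this sketch, a state whose value estimates have settled sees its `epsilon` decay toward zero, while a state whose outcomes suddenly change (for example, because an obstacle now blocks a previously good move) produces large value differences and its `epsilon` rises again. Other states are unaffected, which is the practical difference from a single global exploration rate.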
About the journal:
Robotics and Autonomous Systems will carry articles describing fundamental developments in the field of robotics, with special emphasis on autonomous systems. An important goal of this journal is to extend the state of the art in both symbolic and sensory based robot control and learning in the context of autonomous systems.
Robotics and Autonomous Systems will carry articles on the theoretical, computational and experimental aspects of autonomous systems, or modules of such systems.