{"title":"基于大语言模型的奖励设计用于强化学习智能体行为的自然语言解释","authors":"Shinya Masadome, Taku Harada","doi":"10.1002/tee.70005","DOIUrl":null,"url":null,"abstract":"<p>Reinforcement learning (RL) has found applications across diverse domains; however, it grapples with challenges when formulating reward functions and exhibits low exploration efficiency. Recent studies leveraging large language models (LLMs) have made strides in addressing these issues. However, for RL agents to be practically deployable, elucidating their decision-making process is crucial for enhancing explainability. We introduce a novel RL approach aimed at alleviating the burden of designing reward functions and facilitating natural language explanations for actions grounded in the agent's decisions. Our method employs two types of agents: a low-level agent responsible for concrete action selection and a high-level agent tasked with setting abstract action goals. The high-level agent undergoes training using a hybrid reward function framework, which incentivizes its actions by comparing them with those generated by an LLM across discretized states. Meanwhile, the training of the low-level agent is guided by a reward function designed using the EUREKA algorithm. We applied the proposed method to the cart-pole problem and demonstrated its ability to achieve a learning convergence rate while reducing human effort. Moreover, our approach yields coherent natural language explanations elucidating the rationale behind the agent's actions. © 2025 The Author(s). <i>IEEJ Transactions on Electrical and Electronic Engineering</i> published by Institute of Electrical Engineers of Japan and Wiley Periodicals LLC.</p>","PeriodicalId":13435,"journal":{"name":"IEEJ Transactions on Electrical and Electronic Engineering","volume":"20 8","pages":"1203-1211"},"PeriodicalIF":1.1000,"publicationDate":"2025-03-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/tee.70005","citationCount":"0","resultStr":"{\"title\":\"Reward Design Using Large Language Models for Natural Language Explanation of Reinforcement Learning Agent Actions\",\"authors\":\"Shinya Masadome, Taku Harada\",\"doi\":\"10.1002/tee.70005\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>Reinforcement learning (RL) has found applications across diverse domains; however, it grapples with challenges when formulating reward functions and exhibits low exploration efficiency. Recent studies leveraging large language models (LLMs) have made strides in addressing these issues. However, for RL agents to be practically deployable, elucidating their decision-making process is crucial for enhancing explainability. We introduce a novel RL approach aimed at alleviating the burden of designing reward functions and facilitating natural language explanations for actions grounded in the agent's decisions. Our method employs two types of agents: a low-level agent responsible for concrete action selection and a high-level agent tasked with setting abstract action goals. The high-level agent undergoes training using a hybrid reward function framework, which incentivizes its actions by comparing them with those generated by an LLM across discretized states. Meanwhile, the training of the low-level agent is guided by a reward function designed using the EUREKA algorithm. We applied the proposed method to the cart-pole problem and demonstrated its ability to achieve a learning convergence rate while reducing human effort. 
Moreover, our approach yields coherent natural language explanations elucidating the rationale behind the agent's actions. © 2025 The Author(s). <i>IEEJ Transactions on Electrical and Electronic Engineering</i> published by Institute of Electrical Engineers of Japan and Wiley Periodicals LLC.</p>\",\"PeriodicalId\":13435,\"journal\":{\"name\":\"IEEJ Transactions on Electrical and Electronic Engineering\",\"volume\":\"20 8\",\"pages\":\"1203-1211\"},\"PeriodicalIF\":1.1000,\"publicationDate\":\"2025-03-06\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://onlinelibrary.wiley.com/doi/epdf/10.1002/tee.70005\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEJ Transactions on Electrical and Electronic Engineering\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://onlinelibrary.wiley.com/doi/10.1002/tee.70005\",\"RegionNum\":4,\"RegionCategory\":\"工程技术\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q4\",\"JCRName\":\"ENGINEERING, ELECTRICAL & ELECTRONIC\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEJ Transactions on Electrical and Electronic Engineering","FirstCategoryId":"5","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1002/tee.70005","RegionNum":4,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
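The abstract's core mechanism is the hybrid reward for the high-level agent: the abstract goal it chooses in each discretized state is compared against the goal an LLM proposes for that state, and agreement is rewarded. The Python sketch below illustrates that idea on cart-pole. It is a minimal illustration under stated assumptions, not the paper's implementation: the goal vocabulary (GOALS), the discretization bounds, the stand-in llm_suggested_goal heuristic (a hand-written rule substituting for real LLM output), and the agreement bonus beta are all hypothetical.

```python
# Minimal sketch of the hybrid-reward idea for the high-level agent,
# illustrated on cart-pole. All names, bounds, and constants here are
# illustrative assumptions, not the paper's actual implementation.
import numpy as np

# Assumed vocabulary of abstract goals the high-level agent can set
# for the low-level agent.
GOALS = ["move_left", "move_right", "hold_still"]

def discretize(obs, bins=6):
    """Map a continuous cart-pole observation (cart position, cart
    velocity, pole angle, pole angular velocity) to a coarse tuple of
    bin indices. Bounds are typical cart-pole ranges, assumed here."""
    lows = np.array([-2.4, -3.0, -0.21, -3.0])
    highs = np.array([2.4, 3.0, 0.21, 3.0])
    clipped = np.clip(obs, lows, highs)
    idx = ((clipped - lows) / (highs - lows) * (bins - 1)).astype(int)
    return tuple(idx)

def llm_suggested_goal(state_idx, bins=6):
    """Stand-in for a cached, per-state LLM suggestion: for each
    discretized state, the abstract goal an LLM proposed when prompted
    with a description of that state. A hand-written heuristic is used
    here in place of real LLM output."""
    pole_angle_bin = state_idx[2]
    if pole_angle_bin < bins // 2:
        return "move_left"    # pole leaning left -> move cart under it
    elif pole_angle_bin > bins // 2:
        return "move_right"
    return "hold_still"

def hybrid_reward(env_reward, state_idx, chosen_goal, beta=0.5):
    """Environment reward plus an agreement bonus (beta, assumed) when
    the high-level agent's goal matches the LLM's suggestion."""
    bonus = beta if chosen_goal == llm_suggested_goal(state_idx) else 0.0
    return env_reward + bonus

# Example of one high-level step; the observation and chosen goal are
# hard-coded for illustration.
obs = np.array([0.1, 0.0, 0.15, 0.0])   # pole leaning right
s = discretize(obs)
r = hybrid_reward(env_reward=1.0, state_idx=s, chosen_goal="move_right")
print(s, r)  # goal agrees with the stub suggestion, so r includes beta
```

Comparing against suggestions cached per discretized state, rather than querying the LLM during training, keeps the reward cheap to evaluate; the abstract's phrasing ("across discretized states") is consistent with such precomputation, but this is one plausible realization rather than a confirmed detail.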