{"title":"Reward Design Using Large Language Models for Natural Language Explanation of Reinforcement Learning Agent Actions","authors":"Shinya Masadome, Taku Harada","doi":"10.1002/tee.70005","DOIUrl":null,"url":null,"abstract":"<p>Reinforcement learning (RL) has found applications across diverse domains; however, it grapples with challenges when formulating reward functions and exhibits low exploration efficiency. Recent studies leveraging large language models (LLMs) have made strides in addressing these issues. However, for RL agents to be practically deployable, elucidating their decision-making process is crucial for enhancing explainability. We introduce a novel RL approach aimed at alleviating the burden of designing reward functions and facilitating natural language explanations for actions grounded in the agent's decisions. Our method employs two types of agents: a low-level agent responsible for concrete action selection and a high-level agent tasked with setting abstract action goals. The high-level agent undergoes training using a hybrid reward function framework, which incentivizes its actions by comparing them with those generated by an LLM across discretized states. Meanwhile, the training of the low-level agent is guided by a reward function designed using the EUREKA algorithm. We applied the proposed method to the cart-pole problem and demonstrated its ability to achieve a learning convergence rate while reducing human effort. Moreover, our approach yields coherent natural language explanations elucidating the rationale behind the agent's actions. © 2025 The Author(s). <i>IEEJ Transactions on Electrical and Electronic Engineering</i> published by Institute of Electrical Engineers of Japan and Wiley Periodicals LLC.</p>","PeriodicalId":13435,"journal":{"name":"IEEJ Transactions on Electrical and Electronic Engineering","volume":"20 8","pages":"1203-1211"},"PeriodicalIF":1.1000,"publicationDate":"2025-03-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/tee.70005","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEJ Transactions on Electrical and Electronic Engineering","FirstCategoryId":"5","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1002/tee.70005","RegionNum":4,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
Abstract
Reinforcement learning (RL) has found applications across diverse domains; however, it grapples with challenges when formulating reward functions and exhibits low exploration efficiency. Recent studies leveraging large language models (LLMs) have made strides in addressing these issues. However, for RL agents to be practically deployable, elucidating their decision-making process is crucial for enhancing explainability. We introduce a novel RL approach aimed at alleviating the burden of designing reward functions and facilitating natural language explanations for actions grounded in the agent's decisions. Our method employs two types of agents: a low-level agent responsible for concrete action selection and a high-level agent tasked with setting abstract action goals. The high-level agent undergoes training using a hybrid reward function framework, which incentivizes its actions by comparing them with those generated by an LLM across discretized states. Meanwhile, the training of the low-level agent is guided by a reward function designed using the EUREKA algorithm. We applied the proposed method to the cart-pole problem and demonstrated its ability to achieve learning convergence while reducing human effort. Moreover, our approach yields coherent natural language explanations elucidating the rationale behind the agent's actions. © 2025 The Author(s). IEEJ Transactions on Electrical and Electronic Engineering published by Institute of Electrical Engineers of Japan and Wiley Periodicals LLC.
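To make the hybrid-reward idea in the abstract concrete, below is a minimal sketch of how a high-level agent's abstract goal could be rewarded for agreeing with an LLM's suggestion in a discretized cart-pole state. Everything here is an assumption for illustration, not the authors' implementation: the goal set, the discretization, the agreement bonus, and `llm_suggested_goal` (a fixed heuristic standing in for goals an LLM might return) are all hypothetical, and the EUREKA-designed low-level reward is not modeled.

```python
# Illustrative sketch only: a hybrid reward that adds a bonus when the
# high-level agent's abstract goal matches the goal an LLM would suggest
# for the current discretized cart-pole state. The "LLM" is stubbed out.
import numpy as np

GOALS = ("push_left", "hold", "push_right")  # assumed abstract goal set

def discretize_state(obs, bins=6):
    """Map the continuous cart-pole observation to a coarse discrete state id."""
    # obs = (cart position, cart velocity, pole angle, pole angular velocity)
    bounds = np.array([2.4, 3.0, 0.21, 3.0])
    clipped = np.clip(obs, -bounds, bounds)
    idx = np.floor((clipped + bounds) / (2 * bounds) * (bins - 1)).astype(int)
    return tuple(idx)

def llm_suggested_goal(discrete_state):
    """Stand-in for querying an LLM once per discretized state.
    Here a hand-written heuristic: counteract the pole's lean."""
    _, _, angle_bin, _ = discrete_state
    if angle_bin < 2:
        return "push_left"
    if angle_bin > 3:
        return "push_right"
    return "hold"

def hybrid_reward(env_reward, chosen_goal, discrete_state, agreement_bonus=0.5):
    """Environment reward plus a bonus when the chosen abstract goal
    matches the (stubbed) LLM suggestion for that state."""
    bonus = agreement_bonus if chosen_goal == llm_suggested_goal(discrete_state) else 0.0
    return env_reward + bonus

# One fictitious step: pole tilting right, agent chooses "push_right".
obs = np.array([0.1, 0.0, 0.15, 0.2])
s = discretize_state(obs)
print(s, hybrid_reward(env_reward=1.0, chosen_goal="push_right", discrete_state=s))  # bonus applied -> 1.5
```

In a full setup, the lookup in `llm_suggested_goal` would be replaced by cached LLM responses per discretized state, and the low-level agent would be trained separately against its own reward function.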