Solutions to finite horizon cost problems using actor-critic reinforcement learning

I. Grondman, Hao Xu, S. Jagannathan, Robert Babuška
{"title":"Solutions to finite horizon cost problems using actor-critic reinforcement learning","authors":"I. Grondman, Hao Xu, S. Jagannathan, Robert Babuška","doi":"10.1109/IJCNN.2013.6706755","DOIUrl":null,"url":null,"abstract":"Actor-critic reinforcement learning algorithms have shown to be a successful tool in learning the optimal control for a range of (repetitive) tasks on systems with (partially) unknown dynamics, which may or may not be nonlinear. Most of the reinforcement learning literature published up to this point only deals with modeling the task at hand as a Markov decision process with an infinite horizon cost function. In practice, however, it is sometimes desired to have a solution for the case where the cost function is defined over a finite horizon, which means that the optimal control problem will be time-varying and thus harder to solve. This paper adapts two previously introduced actor-critic algorithms from the infinite horizon setting to the finite horizon setting and applies them to learning a task on a nonlinear system, without needing any assumptions or knowledge about the system dynamics, using radial basis function networks. Simulations on a typical nonlinear motion control problem are carried out, showing that actor-critic algorithms are capable of solving the difficult problem of time-varying optimal control. Moreover, the benefit of using a model learning technique is shown.","PeriodicalId":376975,"journal":{"name":"The 2013 International Joint Conference on Neural Networks (IJCNN)","volume":"23 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2013-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"The 2013 International Joint Conference on Neural Networks (IJCNN)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IJCNN.2013.6706755","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 5

Abstract

Actor-critic reinforcement learning algorithms have been shown to be a successful tool for learning the optimal control of a range of (repetitive) tasks on systems with (partially) unknown dynamics, which may or may not be nonlinear. Most of the reinforcement learning literature published to date deals only with modeling the task at hand as a Markov decision process with an infinite horizon cost function. In practice, however, a solution is sometimes desired for the case in which the cost function is defined over a finite horizon, which means that the optimal control problem becomes time-varying and thus harder to solve. This paper adapts two previously introduced actor-critic algorithms from the infinite horizon setting to the finite horizon setting and applies them, using radial basis function networks, to learning a task on a nonlinear system without requiring any assumptions or knowledge about the system dynamics. Simulations on a typical nonlinear motion control problem show that actor-critic algorithms are capable of solving the difficult problem of time-varying optimal control. Moreover, the benefit of using a model learning technique is demonstrated.
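
To see why the finite horizon makes the problem time-varying, consider the two cost functions side by side. The notation below is standard MDP notation chosen for illustration, not taken verbatim from the paper: an infinite-horizon discounted cost admits a stationary value function, whereas a finite-horizon cost forces a backward, time-indexed recursion.

```latex
% Standard MDP notation (illustrative; not taken verbatim from the paper).
% Infinite-horizon discounted cost: a single stationary value function suffices.
J_\infty = \sum_{k=0}^{\infty} \gamma^k \, c(x_k, u_k)

% Finite-horizon cost over N steps, with an optional terminal cost c_N:
J_N = \sum_{k=0}^{N-1} c(x_k, u_k) + c_N(x_N)

% The finite-horizon value function obeys a backward recursion, so both it
% and the optimal policy depend explicitly on the time step k:
V_k(x) = \min_{u}\left[\, c(x, u) + V_{k+1}\bigl(f(x, u)\bigr) \right],
\qquad V_N(x) = c_N(x)
```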
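One practical consequence of this recursion is that a function approximator must represent a different value function and policy at every time step. The sketch below illustrates the idea with a generic finite-horizon actor-critic that keeps one radial basis function weight vector per time step; it is a minimal reconstruction for illustration only, not the two algorithms adapted in the paper. The dynamics in step(), the RBF centers and width, the horizon length, and all learning rates are assumptions chosen for readability.

```python
# Minimal sketch of a finite-horizon actor-critic with RBF features.
# Illustrative only: environment, RBF layout, and gains are assumptions.
import numpy as np

N = 50                    # finite horizon length (assumption)
n_rbf = 25                # number of radial basis functions (assumption)
centers = np.random.uniform(-1.0, 1.0, size=(n_rbf, 2))  # 2-D state assumed
sigma = 0.3               # RBF width (assumption)

def phi(x):
    """Gaussian RBF feature vector for state x."""
    d2 = np.sum((centers - x) ** 2, axis=1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

# Because the finite-horizon problem is time-varying, the critic and actor
# keep a separate weight vector for every time step k = 0..N-1.
V = np.zeros((N + 1, n_rbf))   # critic weights; V[N] stays 0 (no terminal cost assumed)
pi = np.zeros((N, n_rbf))      # actor weights (scalar action assumed)

alpha_c, alpha_a = 0.1, 0.01   # critic/actor learning rates (assumptions)
expl = 0.3                     # exploration noise standard deviation

def step(x, u):
    """Placeholder pendulum-like dynamics and quadratic cost (assumption)."""
    x_next = x + 0.1 * np.array([x[1], u - np.sin(x[0])])
    cost = x @ x + 0.1 * u ** 2
    return x_next, cost

for episode in range(2000):
    x = np.random.uniform(-1.0, 1.0, size=2)
    for k in range(N):
        f = phi(x)
        du = np.random.randn() * expl          # exploratory perturbation
        u = pi[k] @ f + du
        x_next, cost = step(x, u)
        # one-step temporal-difference error of the time-indexed cost-to-go
        delta = cost + V[k + 1] @ phi(x_next) - V[k] @ f
        V[k] += alpha_c * delta * f            # critic: move V_k toward the TD target
        pi[k] -= alpha_a * delta * du * f      # actor: if cost exceeded prediction,
                                               # move away from the perturbation
        x = x_next
```

Note the time index in the TD error: the bootstrap target uses V[k + 1], not V[k], which is exactly where the finite-horizon setting departs from the stationary infinite-horizon update.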