Least Square Reinforcement Learning for Solving Inverted Pendulum Problem
Satta Panyakaew, Papangkorn Inkeaw, Jakramate Bootkrajang, Jeerayut Chaijaruwanich
2018 3rd International Conference on Computer and Communication Systems (ICCCS), April 2018
DOI: 10.1109/CCOMS.2018.8463234
The inverted pendulum is a classic control problem that can be solved with reinforcement learning. Most previous work considers the problem in a discrete state space; only a few exceptions assume a continuous state domain. In this paper, we consider the cart-pole balancing problem in a continuous state space with a constrained track length. We adopt a least-squares temporal-difference reinforcement learning algorithm to learn the controller. A new reward function is then proposed to better reflect the nature of the task. In addition, we study various factors that play important roles in the success of learning. Empirical studies validate the effectiveness of our method.
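To illustrate the least-squares temporal-difference (LSTD) idea the abstract refers to, the sketch below runs LSTD(0) policy evaluation on a toy one-dimensional system. This is not the paper's controller: the feature map `phi`, the linear dynamics in `step`, and the deviation-penalizing reward are all hypothetical stand-ins chosen to keep the example self-contained. LSTD accumulates the matrix A = Σ φ(s)(φ(s) − γφ(s'))ᵀ and vector b = Σ r φ(s) over observed transitions, then solves A w = b for the linear value-function weights in one step, rather than updating incrementally as TD(0) does.

```python
# Minimal LSTD(0) policy-evaluation sketch. Features, dynamics, and reward
# are illustrative assumptions, not the paper's actual setup.
import random

GAMMA = 0.95  # discount factor


def phi(s):
    # Hand-crafted 2-d feature vector for a scalar state (e.g. pole angle).
    return [1.0, s]


def step(s):
    # Toy stable linear dynamics with noise; reward penalizes deviation
    # from the upright (s = 0) position.
    s_next = 0.9 * s + random.gauss(0.0, 0.01)
    reward = -abs(s_next)
    return s_next, reward


def lstd(num_steps=5000, seed=0):
    random.seed(seed)
    # Accumulate A = sum phi (phi - gamma*phi')^T and b = sum r*phi (2x2 case).
    A = [[0.0, 0.0], [0.0, 0.0]]
    b = [0.0, 0.0]
    s = random.uniform(-0.1, 0.1)
    for _ in range(num_steps):
        s_next, r = step(s)
        f, f_next = phi(s), phi(s_next)
        for i in range(2):
            for j in range(2):
                A[i][j] += f[i] * (f[j] - GAMMA * f_next[j])
            b[i] += r * f[i]
        s = s_next
    # Solve A w = b via the closed-form 2x2 inverse.
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    w = [(A[1][1] * b[0] - A[0][1] * b[1]) / det,
         (A[0][0] * b[1] - A[1][0] * b[0]) / det]
    return w  # weights of the linear value estimate V(s) = w[0] + w[1]*s


w = lstd()
```

Because every reward here is non-positive, the fitted value at the upright state, `w[0]`, comes out negative; a full LSPI-style controller would alternate such evaluation steps with greedy policy improvement over a discrete action set.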