New reinforcement learning algorithm for robot soccer
M. Yoon, J. Bekker, Steve Kroon
ORiON, pp. 1-20. Published 2017-06-16. DOI: 10.5784/33-1-542 (https://doi.org/10.5784/33-1-542)
Reinforcement Learning (RL) is a powerful technique for developing intelligent agents in the field of Artificial Intelligence (AI). This paper proposes a new RL algorithm, called the Temporal-Difference value iteration algorithm with state-value functions, and presents applications of this algorithm to decision-making problems encountered in the RoboCup Small Size League (SSL) domain. Six scenarios were defined to develop shooting skills for an SSL soccer robot in various situations using the proposed algorithm. In each application, an Artificial Neural Network (ANN) model, namely a Multi-Layer Perceptron (MLP), was used as a function approximator. The experimental results showed that the proposed RL algorithm effectively trained the RL agent to acquire good shooting skills. The RL agent performed well under the specified experimental conditions.
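
The abstract does not spell out the update rule, so the following is only a minimal sketch of the general idea of temporal-difference learning on a state-value function, not the authors' algorithm. It uses the textbook TD(0) update V(s) ← V(s) + α[r + γV(s') − V(s)] and picks actions by a one-step lookahead through a known model. Everything specific here is an assumption made for illustration: the five-cell corridor, the reward of 1 at the goal, the hyperparameters, and the use of a plain table where the paper uses an MLP approximator.

```python
import random

# Hypothetical toy task (not from the paper): a corridor of 5 cells,
# start in cell 0, reward 1.0 on reaching the last cell.
N_STATES = 5
GOAL = N_STATES - 1
ACTIONS = (-1, +1)  # step left or step right


def step(state, action):
    """Deterministic toy model: returns (next_state, reward, done)."""
    next_state = min(max(state + action, 0), GOAL)
    done = next_state == GOAL
    return next_state, (1.0 if done else 0.0), done


def td_update(values, state, reward, next_state, done, alpha, gamma):
    """Textbook TD(0) update of a state-value estimate, in place."""
    target = reward + (0.0 if done else gamma * values[next_state])
    values[state] += alpha * (target - values[state])


def greedy_action(values, state, gamma, rng):
    """One-step lookahead through the model; ties are broken at random."""
    scored = []
    for action in ACTIONS:
        next_state, reward, done = step(state, action)
        backed_up = reward + (0.0 if done else gamma * values[next_state])
        scored.append((backed_up, action))
    best = max(score for score, _ in scored)
    return rng.choice([a for score, a in scored if score == best])


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    values = [0.0] * N_STATES  # a table stands in for the MLP approximator
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap the episode length
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # exploratory move
            else:
                action = greedy_action(values, state, gamma, rng)
            next_state, reward, done = step(state, action)
            td_update(values, state, reward, next_state, done, alpha, gamma)
            state = next_state
            if done:
                break
    return values


if __name__ == "__main__":
    print([round(v, 3) for v in train()])
```

Because the agent stores values of states rather than state-action pairs, it needs the model (`step`) to evaluate each candidate action. That is the main design difference from Q-learning-style methods, which learn action values directly and need no model at decision time.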