A Baseline Approach for Goalkeeper Strategy using Sarsa with Tile Coding on the Half Field Offense Environment

2020 19th Brazilian Symposium on Computer Games and Digital Entertainment (SBGames). Pub Date: 2020-11-01. DOI: 10.1109/SBGames51465.2020.00012

V. G. F. Barbosa, R. Neto, Roberto V. L. Gomes Rodrigues


Citations: 1

Abstract

Much research in RoboCup 2D Soccer Simulation has used the Half Field Offense (HFO) environment. This work proposes a baseline approach for goalkeeper strategy using Reinforcement Learning on HFO. The proposed approach uses Sarsa with eligibility traces and Tile Coding to discretize the state variables. Two comparative studies were conducted to validate the proposed baseline. The first compared Agent2D's goalkeeper strategy with a random decision strategy. The second compared the proposed approach with a random decision strategy. Wilcoxon's Signed-Rank test was used to measure the statistical significance of the performance differences. Experiments showed that Agent2D's goalkeeper strategy is inferior to a random decision strategy, and that the proposed baseline outperforms a random decision strategy at a 95% confidence level.
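To make the combination of Sarsa, eligibility traces, and Tile Coding concrete, here is a minimal sketch on a toy one-dimensional task. It is not the authors' implementation and does not use HFO: the `TileCoder` class, the `reset` and `step` functions, the reward of -1 per step, and every hyperparameter value are assumptions chosen only for illustration, since the abstract does not state the paper's state variables, action set, or settings.

```python
import random


class TileCoder:
    """Several offset grids over [lo, hi]; each maps x to exactly one tile."""

    def __init__(self, lo, hi, n_tilings=4, tiles_per_tiling=8):
        self.lo, self.hi = lo, hi
        self.n_tilings = n_tilings
        self.tiles = tiles_per_tiling
        self.width = (hi - lo) / tiles_per_tiling
        self.stride = tiles_per_tiling + 1  # extra tile absorbs the offset
        self.size = n_tilings * self.stride

    def active(self, x):
        """Return one active feature index per tiling."""
        x = min(max(x, self.lo), self.hi)
        indices = []
        for t in range(self.n_tilings):
            offset = t * self.width / self.n_tilings
            tile = min(int((x - self.lo + offset) / self.width), self.tiles)
            indices.append(t * self.stride + tile)
        return indices


def q_value(w, feats, action):
    """Linear action value: sum of the weights of the active tiles."""
    return sum(w[action][i] for i in feats)


def sarsa_lambda(reset, step, coder, n_actions, episodes=300, max_steps=200,
                 alpha=0.1, gamma=1.0, lam=0.9, epsilon=0.1, seed=0):
    """Sarsa(lambda) with replacing traces and epsilon-greedy exploration."""
    rng = random.Random(seed)
    w = [[0.0] * coder.size for _ in range(n_actions)]
    step_size = alpha / coder.n_tilings

    def choose(feats):
        if rng.random() < epsilon:
            return rng.randrange(n_actions)
        values = [q_value(w, feats, a) for a in range(n_actions)]
        return values.index(max(values))

    for _episode in range(episodes):
        z = [[0.0] * coder.size for _ in range(n_actions)]  # traces
        s = reset(rng)
        feats = coder.active(s)
        a = choose(feats)
        for _t in range(max_steps):
            s2, r, done = step(s, a)
            delta = r - q_value(w, feats, a)
            for i in feats:
                z[a][i] = 1.0  # replacing trace for the visited tiles
            if not done:
                feats2 = coder.active(s2)
                a2 = choose(feats2)  # on-policy: next action is really taken
                delta += gamma * q_value(w, feats2, a2)
            for b in range(n_actions):
                for i in range(coder.size):
                    w[b][i] += step_size * delta * z[b][i]
                    z[b][i] *= gamma * lam  # decay every trace
            if done:
                break
            s, feats, a = s2, feats2, a2
    return w


# Hypothetical toy task: walk along [0, 1]; the episode ends near x = 1.
def reset(rng):
    return rng.uniform(0.0, 0.5)


def step(x, action):
    x = min(1.0, max(0.0, x + (0.1 if action == 1 else -0.1)))
    return x, -1.0, x >= 0.95


coder = TileCoder(0.0, 1.0)
w = sarsa_lambda(reset, step, coder, n_actions=2)
```

After training, `q_value(w, coder.active(x), a)` estimates the return of action `a` at position `x`. In this toy task action 1 (move toward the goal) should score higher on average than action 0. Overlapping tilings let an update at one position generalize to its neighbours, and the traces spread each temporal-difference error back over recently visited tiles.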

