Using plan-based reward shaping to learn strategies in StarCraft: Broodwar

Kyriakos Efthymiadis, D. Kudenko
{"title":"Using plan-based reward shaping to learn strategies in StarCraft: Broodwar","authors":"Kyriakos Efthymiadis, D. Kudenko","doi":"10.1109/CIG.2013.6633622","DOIUrl":null,"url":null,"abstract":"StarCraft: Broodwar (SC:BW) is a very popular commercial real strategy game (RTS) which has been extensively used in AI research. Despite being a popular test-bed reinforcement learning (RL) has not been evaluated extensively. A successful attempt was made to show the use of RL in a small-scale combat scenario involving an overpowered agent battling against multiple enemy units [1]. However, the chosen scenario was very small and not representative of the complexity of the game in its entirety. In order to build an RL agent that can manage the complexity of the full game, more efficient approaches must be used to tackle the state-space explosion. In this paper, we demonstrate how plan-based reward shaping can help an agent scale up to larger, more complex scenarios and significantly speed up the learning process as well as how high level planning can be combined with learning focusing on learning the Starcraft strategy, Battlecruiser Rush. We empirically show that the agent with plan-based reward shaping is significantly better both in terms of the learnt policy, as well as convergence speed when compared to baseline approaches which fail at reaching a good enough policy within a practical amount of time.","PeriodicalId":158902,"journal":{"name":"2013 IEEE Conference on Computational Inteligence in Games (CIG)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2013-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"24","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2013 IEEE Conference on Computational Inteligence in Games (CIG)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CIG.2013.6633622","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 24

Abstract

StarCraft: Broodwar (SC:BW) is a very popular commercial real-time strategy (RTS) game which has been used extensively in AI research. Despite being a popular test-bed, reinforcement learning (RL) has not been evaluated on it extensively. A successful attempt was made to show the use of RL in a small-scale combat scenario involving an overpowered agent battling against multiple enemy units [1]. However, the chosen scenario was very small and not representative of the complexity of the game in its entirety. In order to build an RL agent that can manage the complexity of the full game, more efficient approaches must be used to tackle the state-space explosion. In this paper, we demonstrate how plan-based reward shaping can help an agent scale up to larger, more complex scenarios and significantly speed up the learning process, as well as how high-level planning can be combined with learning, focusing on the StarCraft strategy Battlecruiser Rush. We empirically show that the agent with plan-based reward shaping is significantly better, both in terms of the learnt policy and convergence speed, than baseline approaches, which fail to reach a good enough policy within a practical amount of time.
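The paper itself does not include code; as a rough illustration of the general technique the abstract refers to, the sketch below shows potential-based reward shaping where the potential function is derived from progress through a hand-written high-level plan (e.g., the build order leading to a Battlecruiser rush). The plan steps, constants, state interface, and the tabular Sarsa update are assumptions for illustration only, not the authors' implementation.

```python
# Illustrative sketch of plan-based (potential-based) reward shaping.
# All names, the toy plan, and the constants are assumed for illustration.

import random
from collections import defaultdict

GAMMA = 0.99
ALPHA = 0.1
EPSILON = 0.1
OMEGA = 10.0  # scaling of the plan-derived potential (assumed value)

# A hand-written high-level plan, e.g. the steps of a Battlecruiser rush.
PLAN = ["train_scvs", "build_refinery", "build_factory",
        "build_starport", "build_battlecruisers", "attack"]

def plan_step(state):
    """Index of the furthest completed plan step.
    `state` is assumed to expose a `completed` set of plan-step names."""
    step = 0
    for i, goal in enumerate(PLAN, start=1):
        if goal in state.completed:
            step = i
    return step

def potential(state):
    # Phi(s): progress through the high-level plan, scaled by OMEGA.
    return OMEGA * plan_step(state)

def shaped_reward(r, s, s_next):
    # Potential-based shaping: F(s, s') = gamma * Phi(s') - Phi(s).
    return r + GAMMA * potential(s_next) - potential(s)

Q = defaultdict(float)  # tabular Q-values keyed by (state_key, action)

def choose_action(state_key, actions):
    # Epsilon-greedy action selection over the tabular Q-values.
    if random.random() < EPSILON:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(state_key, a)])

def sarsa_update(s_key, a, r, s, s_next, s_next_key, a_next):
    # Standard Sarsa update, with the shaping term added to the reward.
    target = shaped_reward(r, s, s_next) + GAMMA * Q[(s_next_key, a_next)]
    Q[(s_key, a)] += ALPHA * (target - Q[(s_key, a)])
```

Because the shaping term is a difference of potentials, it speeds up learning without changing which policies are optimal; the plan only biases exploration toward states further along the high-level strategy.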