Applying reinforcement learning to small scale combat in the real-time strategy game StarCraft:Broodwar
S. Wender, I. Watson
2012 IEEE Conference on Computational Intelligence and Games (CIG), 2012-12-06
DOI: 10.1109/CIG.2012.6374183 · Citations: 104
This paper presents an evaluation of the suitability of reinforcement learning (RL) algorithms for micro-managing combat units in the commercial real-time strategy (RTS) game StarCraft:Broodwar (SC:BW). The applied techniques are variations of the common Q-learning and Sarsa algorithms, both simple one-step versions and more sophisticated versions that use eligibility traces to offset the problem of delayed reward. The aim is to design an agent that can learn in an unsupervised manner in a complex environment, eventually taking over tasks previously performed by non-adaptive, deterministic game AI. The preliminary results presented in this paper show the viability of the RL algorithms at learning the selected task. Of the evaluated algorithms, one-step Q-learning and Sarsa(λ) prove best at learning to manage combat units, depending on whether the focus lies on maximizing the reward or on the speed of learning.
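The abstract contrasts one-step Q-learning with Sarsa(λ), which uses eligibility traces to propagate delayed rewards backward through recently visited state-action pairs. As a hedged illustration only (this is not the paper's implementation — the states, actions, rewards, and hyperparameters below are toy placeholders), the two core updates can be sketched as:

```python
from collections import defaultdict

def q_learning_step(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """One-step Q-learning: off-policy, bootstraps from the greedy next action."""
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])

def sarsa_lambda_step(Q, E, s, a, r, s_next, a_next,
                      alpha=0.1, gamma=0.9, lam=0.8):
    """Sarsa(λ): on-policy TD error spread over all eligible state-action
    pairs, which helps offset rewards that arrive many steps later
    (e.g. only once a combat encounter is won or lost)."""
    delta = r + gamma * Q[(s_next, a_next)] - Q[(s, a)]
    E[(s, a)] += 1.0                       # accumulating trace for current pair
    for key in list(E):
        Q[key] += alpha * delta * E[key]   # update every eligible pair
        E[key] *= gamma * lam              # decay traces over time

# Toy usage: placeholder combat states and unit actions
Q = defaultdict(float)
E = defaultdict(float)
q_learning_step(Q, 's0', 'attack', 1.0, 's1', ['attack', 'retreat'])
sarsa_lambda_step(Q, E, 's1', 'retreat', 0.5, 's0', 'attack')
```

The trade-off the abstract reports is visible in the structure: the one-step update touches a single state-action value per step, while the eligibility-trace update credits the whole recent trajectory at once, trading per-step cost for faster credit assignment under delayed reward.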