{"title":"通过学习合作单位的行为创造大量的游戏ai","authors":"Stephen Wiens, J. Denzinger, Sanjeev Paskaradevan","doi":"10.1109/CIG.2013.6633608","DOIUrl":null,"url":null,"abstract":"We present two improvements to the hybrid learning method for the shout-ahead architecture for units in the game Battle for Wesnoth. The shout-ahead architecture allows for units to perform decision making in two stages, first determining an action without knowledge of the intentions of other units, then, after communicating the intended action and likewise receiving the intentions of the other units, taking these intentions into account for the final decision on the next action. The decision making uses two rule sets and reinforcement learning is used to learn rule weights (that influence decision making), while evolutionary learning is used to evolve good rule sets. Our improvements add knowledge about terrain to the learning and also evaluate unit behaviors on several scenario maps to learn more general rules. The use of terrain knowledge resulted in improvements in the win percentage of evolved teams between 3 and 14 percentage points for different maps, while using several maps to learn from resulted in nearly similar win percentages on maps not learned from as on the maps learned from.","PeriodicalId":158902,"journal":{"name":"2013 IEEE Conference on Computational Inteligence in Games (CIG)","volume":"56 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2013-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":"{\"title\":\"Creating large numbers of game AIs by learning behavior for cooperating units\",\"authors\":\"Stephen Wiens, J. Denzinger, Sanjeev Paskaradevan\",\"doi\":\"10.1109/CIG.2013.6633608\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We present two improvements to the hybrid learning method for the shout-ahead architecture for units in the game Battle for Wesnoth. The shout-ahead architecture allows for units to perform decision making in two stages, first determining an action without knowledge of the intentions of other units, then, after communicating the intended action and likewise receiving the intentions of the other units, taking these intentions into account for the final decision on the next action. The decision making uses two rule sets and reinforcement learning is used to learn rule weights (that influence decision making), while evolutionary learning is used to evolve good rule sets. Our improvements add knowledge about terrain to the learning and also evaluate unit behaviors on several scenario maps to learn more general rules. The use of terrain knowledge resulted in improvements in the win percentage of evolved teams between 3 and 14 percentage points for different maps, while using several maps to learn from resulted in nearly similar win percentages on maps not learned from as on the maps learned from.\",\"PeriodicalId\":158902,\"journal\":{\"name\":\"2013 IEEE Conference on Computational Inteligence in Games (CIG)\",\"volume\":\"56 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2013-10-17\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"5\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2013 IEEE Conference on Computational Inteligence in Games (CIG)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/CIG.2013.6633608\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2013 IEEE Conference on Computational Inteligence in Games (CIG)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CIG.2013.6633608","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 5
摘要
我们对游戏《Battle for Wesnoth》中单位的呼喊提前架构的混合学习方法进行了两项改进。提前喊话架构允许单位分两个阶段执行决策,首先在不知道其他单位意图的情况下决定行动,然后在传达预期行动并同样接收到其他单位的意图后,将这些意图考虑到下一个行动的最终决定。决策使用两个规则集,强化学习用于学习规则权重(影响决策),而进化学习用于进化好的规则集。我们的改进增加了关于地形的知识,并在几个场景地图上评估单位的行为,以学习更多的一般规则。地形知识的使用使进化团队在不同地图上的胜率提高了3到14个百分点,而使用几张地图进行学习,在没有学习的地图上的胜率和学习过的地图上的胜率几乎相同。
Creating large numbers of game AIs by learning behavior for cooperating units
We present two improvements to the hybrid learning method for the shout-ahead architecture for units in the game Battle for Wesnoth. The shout-ahead architecture allows for units to perform decision making in two stages, first determining an action without knowledge of the intentions of other units, then, after communicating the intended action and likewise receiving the intentions of the other units, taking these intentions into account for the final decision on the next action. The decision making uses two rule sets and reinforcement learning is used to learn rule weights (that influence decision making), while evolutionary learning is used to evolve good rule sets. Our improvements add knowledge about terrain to the learning and also evaluate unit behaviors on several scenario maps to learn more general rules. The use of terrain knowledge resulted in improvements in the win percentage of evolved teams between 3 and 14 percentage points for different maps, while using several maps to learn from resulted in nearly similar win percentages on maps not learned from as on the maps learned from.