{"title":"Learning to shoot in first person shooter games by stabilizing actions and clustering rewards for reinforcement learning","authors":"F. Glavin, M. G. Madden","doi":"10.1109/CIG.2015.7317928","DOIUrl":null,"url":null,"abstract":"While reinforcement learning (RL) has been applied to turn-based board games for many years, more complex games involving decision-making in real-time are beginning to receive more attention. A challenge in such environments is that the time that elapses between deciding to take an action and receiving a reward based on its outcome can be longer than the interval between successive decisions. We explore this in the context of a non-player character (NPC) in a modern first-person shooter game. Such games take place in 3D environments where players, both human and computer-controlled, compete by engaging in combat and completing task objectives. We investigate the use of RL to enable NPCs to gather experience from game-play and improve their shooting skill over time from a reward signal based on the damage caused to opponents. We propose a new method for RL updates and reward calculations, in which the updates are carried out periodically, after each shooting encounter has ended, and a new weighted-reward mechanism is used which increases the reward applied to actions that lead to damaging the opponent in successive hits in what we term “hit clusters”.","PeriodicalId":244862,"journal":{"name":"2015 IEEE Conference on Computational Intelligence and Games (CIG)","volume":"138 3 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"13","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2015 IEEE Conference on Computational Intelligence and Games (CIG)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CIG.2015.7317928","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 13
Abstract
While reinforcement learning (RL) has been applied to turn-based board games for many years, more complex games involving decision-making in real-time are beginning to receive more attention. A challenge in such environments is that the time that elapses between deciding to take an action and receiving a reward based on its outcome can be longer than the interval between successive decisions. We explore this in the context of a non-player character (NPC) in a modern first-person shooter game. Such games take place in 3D environments where players, both human and computer-controlled, compete by engaging in combat and completing task objectives. We investigate the use of RL to enable NPCs to gather experience from game-play and improve their shooting skill over time from a reward signal based on the damage caused to opponents. We propose a new method for RL updates and reward calculations, in which the updates are carried out periodically, after each shooting encounter has ended, and a new weighted-reward mechanism is used which increases the reward applied to actions that lead to damaging the opponent in successive hits in what we term “hit clusters”.