Navigation in Adversarial Environments Guided by PRA* and a Local RL Planner

Debraj Ray, Nathan R. Sturtevant
{"title":"Navigation in Adversarial Environments Guided by PRA* and a Local RL Planner","authors":"Debraj Ray, Nathan R. Sturtevant","doi":"10.1609/aiide.v19i1.27530","DOIUrl":null,"url":null,"abstract":"Real-time strategy games require players to respond to short-term challenges (micromanagement) and long-term objectives (macromanagement) simultaneously to win. However, many players excel at one of these skills but not both. This research is motivated by the question of whether the burden of micromanagement can be reduced on human players through delegation of responsibility to autonomous agents. In particular, this research proposes an adversarial navigation architecture that enables units to autonomously navigate through places densely populated with enemies by learning to micromanage itself. Our approach models the adversarial pathfinding problem as a Markov Decision Process (MDP) and trains an agent with reinforcement learning on this MDP. We observed that our approach resulted in the agent taking less damage from adversaries while travelling shorter paths, compared to previous approaches for adversarial navigation. Our approach is also efficient in memory use and computation time. Interestingly, the agent using the proposed approach also outperformed baseline approaches while navigating through environments that are significantly different from the training environments. Furthermore, when the game design is modified, the agent discovers effective alternate strategies considering the updated design without any changes in the learning framework. This property is particularly useful because in game development the design of a game is often updated iteratively.","PeriodicalId":498041,"journal":{"name":"Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment","volume":"49 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1609/aiide.v19i1.27530","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Real-time strategy games require players to respond to short-term challenges (micromanagement) and long-term objectives (macromanagement) simultaneously in order to win; however, many players excel at one of these skills but not both. This research is motivated by the question of whether the burden of micromanagement on human players can be reduced by delegating responsibility to autonomous agents. In particular, it proposes an adversarial navigation architecture that enables units to navigate autonomously through areas densely populated with enemies by learning to micromanage themselves. Our approach models the adversarial pathfinding problem as a Markov Decision Process (MDP) and trains an agent on this MDP with reinforcement learning. We observed that, compared to previous approaches for adversarial navigation, our approach results in the agent taking less damage from adversaries while travelling shorter paths. Our approach is also efficient in memory use and computation time. Interestingly, an agent using the proposed approach also outperformed the baseline approaches while navigating through environments significantly different from its training environments. Furthermore, when the game design is modified, the agent discovers effective alternative strategies for the updated design without any changes to the learning framework. This property is particularly useful because game designs are often updated iteratively during development.
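The abstract gives no implementation details, but the architecture named in the title, a global PRA* path guiding a local RL planner, can be sketched. Below is a minimal, self-contained Python illustration of that split, assuming a grid world: a stub stands in for real PRA* path refinement, and a tabular Q-learner over waypoint-relative states stands in for whatever policy class and state encoding the authors actually train. The function names, action set, and reward weights are all assumptions made for illustration, not the paper's method.

```python
import random

# 4-connected grid moves for the local planner (hypothetical action set).
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]

def pra_star_waypoints(start, goal, step=4):
    """Placeholder for PRA*. Real PRA* refines a path through a hierarchy
    of map abstractions; this stub just interpolates coarse waypoints so
    the local planner below has something to chase."""
    waypoints, (x, y) = [], start
    while (x, y) != goal:
        x += max(-step, min(step, goal[0] - x))
        y += max(-step, min(step, goal[1] - y))
        waypoints.append((x, y))
    return waypoints

def reward(pos, next_pos, waypoint, damage_taken):
    """Hypothetical reward: progress toward the waypoint, a small step
    cost to favor short paths, and a penalty for damage from enemies."""
    dist = lambda a, b: abs(a[0] - b[0]) + abs(a[1] - b[1])
    progress = dist(pos, waypoint) - dist(next_pos, waypoint)
    return progress - 0.1 - 5.0 * damage_taken

class LocalQPlanner:
    """Tabular Q-learning over offset-to-waypoint states; a stand-in for
    whatever state encoding and learner the paper actually uses."""
    def __init__(self, alpha=0.1, gamma=0.95, eps=0.1):
        self.q, self.alpha, self.gamma, self.eps = {}, alpha, gamma, eps

    def act(self, state):
        # Epsilon-greedy action selection over the tabular Q-values.
        if random.random() < self.eps:
            return random.randrange(len(ACTIONS))
        return max(range(len(ACTIONS)),
                   key=lambda a: self.q.get((state, a), 0.0))

    def update(self, state, action, r, next_state):
        # Standard one-step Q-learning backup.
        best = max(self.q.get((next_state, a), 0.0)
                   for a in range(len(ACTIONS)))
        old = self.q.get((state, action), 0.0)
        self.q[(state, action)] = old + self.alpha * (r + self.gamma * best - old)

if __name__ == "__main__":
    planner, pos, goal = LocalQPlanner(), (0, 0), (12, 8)
    for wp in pra_star_waypoints(pos, goal):
        for _ in range(50):               # local step budget per waypoint
            if pos == wp:
                break
            state = (wp[0] - pos[0], wp[1] - pos[1])
            a = planner.act(state)
            next_pos = (pos[0] + ACTIONS[a][0], pos[1] + ACTIONS[a][1])
            # A real environment would report damage from nearby enemies here.
            r = reward(pos, next_pos, wp, damage_taken=0)
            planner.update(state, a, r, (wp[0] - next_pos[0], wp[1] - next_pos[1]))
            pos = next_pos
    print("final position:", pos)
```

One appeal of this decomposition, consistent with the generalization results the abstract reports, is that the local policy only ever observes waypoint-relative state, so nothing it learns is tied to any one global map.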