Multi-agent Dual Level Reinforcement Learning of Strategy and Tactics in Competitive Games

Q3 Mathematics
Chengping Yuan , Md Abdullah Al Forhad , Ranak Bansal , Anna Sidorova , Mark V. Albert
Journal: Results in Control and Optimization, Volume 16, Article 100471
DOI: 10.1016/j.rico.2024.100471
Publication date: 2024-09-01
Publication type: Journal Article
Open-access PDF: https://www.sciencedirect.com/science/article/pii/S2666720724001012/pdfft?md5=d89231a731f9cb6cfdcb9dcc91471705&pid=1-s2.0-S2666720724001012-main.pdf
Cited by: 0

Abstract


Reinforcement learning has been used extensively to learn the low-level tactical choices during gameplay; however, less effort is invested in the strategic decisions governing the effective engagement of a diverse set of opponents. In this paper, a two-tier reinforcement learning model is created to play competitive games and effectively engage in matches with different opponents to maximize earnings. The multi-agent environment has four types of learners, which vary in their ability to learn gameplay directly (tactics) and their ability to learn to bet or withdraw from gameplay (strategy). The players are tested in three different competitive games: Connect 4, Dots and Boxes, and Tic-Tac-Toe. Analyzing the behavior of players as they progress from naivety to game mastery reveals some interesting features: (1) learners who optimize strategy and tactics outperform all learners, (2) learners who initially optimize their strategy to engage in matches outperform those who focus on optimizing tactical gameplay, and (3) the advantage of strategy optimization versus tactical gameplay optimization diminishes as more games are played. A reinforcement learning model with a dual learning scheme presents possible applications in adversarial scenarios where both strategic and tactical learning are critical. We present detailed results in a systematic manner, providing strong support for our claim.
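The abstract describes a dual-level design: a strategy tier that learns whether to bet on or withdraw from a match against a given opponent, and a tactics tier that learns move selection within the game itself. The paper does not publish its implementation, so the following is only an illustrative sketch under our own assumptions: the strategy tier is modeled as an epsilon-greedy bandit over opponents, and the tactics tier as standard tabular Q-learning; all class and parameter names are hypothetical.

```python
import random
from collections import defaultdict


class StrategyTier:
    """Epsilon-greedy bandit: per opponent, learn whether betting pays off."""

    def __init__(self, epsilon=0.1, lr=0.1):
        self.value = defaultdict(float)  # opponent id -> estimated payoff of betting
        self.epsilon = epsilon
        self.lr = lr

    def decide(self, opponent):
        if random.random() < self.epsilon:  # occasional exploration
            return random.choice(["bet", "withdraw"])
        # Withdrawing earns nothing, so bet only if expected payoff is positive.
        return "bet" if self.value[opponent] > 0 else "withdraw"

    def update(self, opponent, payoff):
        # Incremental update of the running payoff estimate for this opponent.
        self.value[opponent] += self.lr * (payoff - self.value[opponent])


class TacticsTier:
    """Tabular one-step Q-learning over (state, action) pairs of the board game."""

    def __init__(self, epsilon=0.1, lr=0.5, gamma=0.9):
        self.q = defaultdict(float)
        self.epsilon, self.lr, self.gamma = epsilon, lr, gamma

    def act(self, state, legal_actions):
        if random.random() < self.epsilon:
            return random.choice(legal_actions)
        return max(legal_actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state, next_actions):
        # Standard Q-learning backup: Q <- Q + lr * (r + gamma * max_a' Q' - Q).
        best_next = max((self.q[(next_state, a)] for a in next_actions), default=0.0)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.lr * (target - self.q[(state, action)])
```

In a training loop, the strategy tier would first call `decide(opponent)`; only when it returns `"bet"` does the tactics tier play out the game (e.g. Tic-Tac-Toe), after which the match payoff feeds `StrategyTier.update` and the per-move rewards feed `TacticsTier.update`. This separation mirrors the paper's finding that the two levels can be learned at different rates and from different reward signals.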

Source journal
Results in Control and Optimization (Mathematics - Control and Optimization)
CiteScore: 3.00
Self-citation rate: 0.00%
Articles per year: 51
Review time: 91 days