IEEE Transactions on Games: Latest Articles

Deep Multitask Multiagent Reinforcement Learning With Knowledge Transfer
IF 1.7, Zone 4, Computer Science
IEEE Transactions on Games Pub Date : 2023-09-19 DOI: 10.1109/TG.2023.3316697
Yuxiang Mai;Yifan Zang;Qiyue Yin;Wancheng Ni;Kaiqi Huang
Abstract: Despite the potential of multiagent reinforcement learning (MARL) in addressing numerous complex tasks, training a single team of MARL agents to handle multiple diverse team tasks remains a challenge. In this article, we introduce a novel Multitask method based on Knowledge Transfer in cooperative MARL (MKT-MARL). By learning from task-specific teachers, our approach empowers a single team of agents to attain expert-level performance in multiple tasks. MKT-MARL utilizes a knowledge distillation algorithm specifically designed for the multiagent architecture, which rapidly learns a team control policy incorporating common coordinated knowledge from the experience of task-specific teachers. In addition, we enhance this training with teacher annealing, gradually shifting the model's learning from distillation toward environmental rewards. This enhancement helps the multitask model surpass its single-task teachers. We extensively evaluate our algorithm using two commonly-used benchmarks: StarCraft II micromanagement and multiagent particle environment. The experimental results demonstrate that our algorithm outperforms both the single-task teachers and a jointly trained team of agents. Extensive ablation experiments illustrate the effectiveness of the supervised knowledge transfer and the teacher annealing strategy.
Volume 16, Issue 3, pp. 566-576.
Citations: 0
Call for Papers—IEEE Transactions on Games Special Issue on Human-Centered AI in Game Evaluation
IF 2.3, Zone 4, Computer Science
IEEE Transactions on Games Pub Date : 2023-09-14 DOI: 10.1109/TG.2023.3312909
Volume 15, Issue 3, p. 492. Open-access PDF: https://ieeexplore.ieee.org/iel7/7782673/10251473/10251484.pdf
Citations: 0
IEEE Computational Intelligence Society Information
IF 2.3, Zone 4, Computer Science
IEEE Transactions on Games Pub Date : 2023-09-14 DOI: 10.1109/TG.2023.3310831
Volume 15, Issue 3, p. C3. Open-access PDF: https://ieeexplore.ieee.org/iel7/7782673/10251473/10251490.pdf
Citations: 0
IEEE Transactions on Games Publication Information
IF 2.3, Zone 4, Computer Science
IEEE Transactions on Games Pub Date : 2023-09-14 DOI: 10.1109/TG.2023.3310833
Volume 15, Issue 3, p. C2. Open-access PDF: https://ieeexplore.ieee.org/iel7/7782673/10251473/10251491.pdf
Citations: 0
Leveraging the OPT Large Language Model for Sentiment Analysis of Game Reviews
IF 2.3, Zone 4, Computer Science
IEEE Transactions on Games Pub Date : 2023-09-08 DOI: 10.1109/TG.2023.3313121
Markos Viggiato;Cor-Paul Bezemer
Abstract: Automatically extracting players' sentiments about games can help game developers to better understand the aspects of their games that players like or dislike. Our prior work showed that traditional sentiment analysis techniques do not perform well on game reviews. However, the natural language processing field has seen a steep progress in recent years. In this letter, we follow up on our prior work and investigate how a state-of-the-art large language model (OPT-175B) performs on the sentiment classification of game reviews. We manually analyze the game reviews wrongly classified by OPT-175B to better understand the issues that affect the performance of that model and how those issues compare to the challenges faced by traditional classifiers. We found that OPT-175B achieves (far) better performance than traditional sentiment classifiers, with a 72%-increased F-measure and a 30%-increased AUC compared to the best traditional classifier studied in our prior work. We also found that common challenges of traditional classifiers, such as reviews with game comparisons and negative terminology, have been mostly solved by the OPT-175B model.
Volume 16, Issue 2, pp. 493-496.
Citations: 0
MCMARL: Parameterizing Value Function via Mixture of Categorical Distributions for Multi-Agent Reinforcement Learning
IF 1.7, Zone 4, Computer Science
IEEE Transactions on Games Pub Date : 2023-08-30 DOI: 10.1109/TG.2023.3310150
Jian Zhao;Mingyu Yang;Youpeng Zhao;Xunhan Hu;Wengang Zhou;Houqiang Li
Abstract: In cooperative multi-agent tasks, a team of agents jointly interact with an environment by taking actions, receiving a team reward, and observing the next state. During the interactions, the uncertainty of environment and reward will inevitably induce stochasticity in the long-term returns, and the randomness can be exacerbated with the increasing number of agents. However, such randomness is ignored by most of the existing value-based multi-agent reinforcement learning (MARL) methods, which only model the expectation of Q-value for both the individual agents and the team. Compared to using the expectations of the long-term returns, it is preferable to directly model the stochasticity by estimating the returns through distributions. With this motivation, this article proposes a novel value-based MARL framework from a distributional perspective, i.e., parameterizing value function via Mixture of Categorical distributions for MARL (MCMARL). Specifically, we model both the individual and global Q-values with categorical distribution. To integrate categorical distributions, we define five basic operations on the distribution, which allow the generalization of expected value function factorization methods (e.g., value decomposition networks (VDN) and QMIX) to their MCMARL variants. We further prove that our MCMARL framework satisfies the Distributional-Individual-Global-Max principle with respect to the expectation of distribution, which guarantees the consistency between joint and individual greedy action selections in the global and individual Q-values. Empirically, we evaluate MCMARL on both the stochastic matrix game and the challenging set of StarCraft II micromanagement tasks, showing the efficacy of our framework.
Volume 16, Issue 3, pp. 556-565.
Citations: 0
Deep Reinforcement Learning Using Optimized Monte Carlo Tree Search in EWN
IF 1.7, Zone 4, Computer Science
IEEE Transactions on Games Pub Date : 2023-08-28 DOI: 10.1109/TG.2023.3308898
Yixian Zhang;Zhuoxuan Li;Yiding Cao;Xuan Zhao;Jinde Cao
Abstract: EinStein würfelt nicht! (EWN) is a perfect information stochastic game, in which randomness influences the game process enormously. In this article, we propose an optimized algorithm named Quick Neural Network Tree Search (QNNTS) based on deep reinforcement learning and Monte Carlo tree search (MCTS) to construct the artificial intelligence agent of EWN. Meanwhile, the lightness of the model makes it possible to train with much less computing resources. The optimization structure of the algorithm based on MCTS is named Optimized Upper Confidence Bound Applied to Tree with Heuristic Search, which introduces the expectation valuation strategy into the MCTS. As the prerequisite product of QNNTS, it performs with an improvement of the winning rate. Ultimately, the Attention-ResNet structure combined with domain knowledge is used to obtain the proposed algorithm. Compared with several conventional algorithms, it gains high winning rates of at least 68%.
Volume 16, Issue 3, pp. 544-555.
Citations: 0
Multigoal Reinforcement Learning via Exploring Entropy-Regularized Successor Matching
IF 2.3, Zone 4, Computer Science
IEEE Transactions on Games Pub Date : 2023-08-11 DOI: 10.1109/TG.2023.3304315
Xiaoyun Feng;Yun Zhou
Abstract: Multigoal reinforcement learning (RL) algorithms tend to achieve and generalize over diverse goals. However, unlike single-goal agents, multigoal agents struggle to break through the exploration bottleneck with a fair share of interactions, owing to rarely reusable goal-oriented experiences with sparse goal-reaching rewards. Therefore, well-arranged behavior goals during training are essential for multigoal agents, especially in long-horizon tasks. To this end, we propose efficient multigoal exploration on the basis of maximizing the entropy of successor features and Exploring entropy-regularized successor matching, namely, E²SM. E²SM adopts the idea of a successor feature and extends it to entropy-regularized goal-reaching successor mapping that serves as a more stable state feature under sparse rewards. The key contribution of our work is to perform intrinsic goal setting with behavior goals that are more likely to be achieved in terms of future state occupancies as well as promising in expanding the exploration frontier. Experiments on challenging long-horizon manipulation tasks show that E²SM deals well with sparse rewards and in pursuit of maximal state-covering, E²SM efficiently identifies valuable behavior goals toward specific goal-reaching by matching the successor mapping.
Volume 15, Issue 4, pp. 538-548.
Citations: 0
Leveraging Joint-Action Embedding in Multiagent Reinforcement Learning for Cooperative Games
IF 2.3, Zone 4, Computer Science
IEEE Transactions on Games Pub Date : 2023-08-07 DOI: 10.1109/TG.2023.3302694
Xingzhou Lou;Junge Zhang;Yali Du;Chao Yu;Zhaofeng He;Kaiqi Huang
Abstract: State-of-the-art multiagent policy gradient (MAPG) methods have demonstrated convincing capability in many cooperative games. However, the exponentially growing joint-action space severely challenges the critic's value evaluation and hinders performance of MAPG methods. To address this issue, we augment Central-Q policy gradient with a joint-action embedding function and propose mutual-information maximization MAPG (M3APG). The joint-action embedding function makes joint-actions contain information of state transitions, which will improve the critic's generalization over the joint-action space by allowing it to infer joint-actions' outcomes. We theoretically prove that with a fixed joint-action embedding function, the convergence of M3APG is guaranteed. Experiment results of the StarCraft multiagent challenge (SMAC) demonstrate that M3APG gives evaluation results with better accuracy and outperform other MAPG basic models across various maps of multiple difficulty levels. We empirically show that our joint-action embedding model can be extended to value-based multiagent reinforcement learning methods and state-of-the-art MAPG methods. Finally, we run an ablation study to show that the usage of mutual information in our method is necessary and effective.
Volume 16, Issue 2, pp. 470-482.
Citations: 0
Improved Exploration With Demonstrations in Procedurally-Generated Environments
IF 1.7, Zone 4, Computer Science
IEEE Transactions on Games Pub Date : 2023-07-31 DOI: 10.1109/TG.2023.3299986
Mao Xu;Shuzhi Sam Ge;Dongjie Zhao;Qian Zhao
Abstract: Exploring sparse reward environments remains a major challenge in model-free deep reinforcement learning (RL). State-of-the-art exploration methods address this challenge by utilizing intrinsic rewards to guide exploration in uncertain environment dynamics or novel states. However, these methods fall short in procedurally-generated environments, where the agent is unlikely to visit a state more than once due to the different environments generated in each episode. Recently, imitation-learning-based exploration methods have been proposed to guide exploration in different kinds of procedurally-generated environments by imitating high-quality exploration episodes. However, these methods have weaker exploration capabilities and lower sample efficiency in complex procedurally-generated environments. Motivated by the fact that demonstrations can guide exploration in sparse reward environments, we propose improved exploration with demonstrations (IEWD), an improved imitation-learning-based exploration method in procedurally-generated environments, which utilizes demonstrations from these environments. IEWD assigns different episode-level exploration scores to each demonstration episode and generated episode. IEWD then ranks these episodes based on their scores and stores highly-scored episodes into a small ranking buffer. IEWD treats these highly-scored episodes as good exploration episodes and makes the deep RL agent imitate exploration behaviors from the ranking buffer to reproduce exploration behaviors from good exploration episodes. Additionally, IEWD adopts the experience replay buffer to store generated positive episodes and demonstrations and employs self-imitating learning to utilize experiences from the experience replay buffer to optimize the policy of the deep RL agent. We evaluate our method IEWD on several procedurally-generated MiniGrid environments and 3-D maze environments from MiniWorld. The results show that IEWD significantly outperforms existing learning from demonstration methods and exploration methods, including state-of-the-art imitation-learning-based exploration methods, in terms of sample efficiency and final performance in complex procedurally-generated environments.
Volume 16, Issue 3, pp. 530-543.
Citations: 0