{"title":"Fun as Moderate Divergence: Evaluating Experience-Driven PCG via RL","authors":"Ziqi Wang;Yuchen Li;Haocheng Du;Jialin Liu;Georgios N. Yannakakis","doi":"10.1109/TG.2024.3456101","DOIUrl":"10.1109/TG.2024.3456101","url":null,"abstract":"The computational modeling of player experience is key to the generation of personalized game content. The notion of <italic>fun</i>, as one of the most peculiar and core aspects of game experience, has often been modeled and quantified for the purpose of content generation with varying success. Recently, measures of a player's <italic>fun</i> have been ad-hoc designed to model moderate levels of in-game divergence in platformer games, inspired by Koster's <italic>theory of fun</i>. Such measures have shaped the reward functions of game content generative methods following the <italic>experience-driven procedural content generation via reinforcement learning</i> (EDRL) paradigm in <italic>Super Mario Bros</i> In this article, we present a comprehensive user study involving over 90 participants with a dual purpose: to evaluate the ad-hoc <italic>fun</i> metrics introduced in the literature and test the effectiveness of the EDRL framework to generate personalized <italic>fun</i> <italic>Super Mario Bros</i> experiences in an online fashion. Our key findings suggest that moderate degrees of game level and gameplay divergence are highly consistent with the perceived notion of <italic>fun</i> of our participants, cross-verifying the ad-hoc designed <italic>fun</i> metrics. On the other hand, it appears that EDRL generators manage to match the preferred (i.e., <italic>fun</i>) game experiences of each persona, only in part and for some players. Our findings suggest that the use of multifaceted in-game data, such as events and actions, will likely enable the modeling of more nuanced gameplay behaviors. In addition, the verification of player persona modeling and the enhancement of player engagement through dynamic experience modelling are suggested as potential future directions.","PeriodicalId":55977,"journal":{"name":"IEEE Transactions on Games","volume":"17 2","pages":"360-373"},"PeriodicalIF":1.7,"publicationDate":"2024-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142218413","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient Adversarially Guided Actor–Critic","authors":"Mao Xu;Shuzhi Sam Ge;Dongjie Zhao;Qian Zhao","doi":"10.1109/TG.2024.3453444","DOIUrl":"10.1109/TG.2024.3453444","url":null,"abstract":"Exploring procedurally-generated environments presents a formidable challenge in model-free deep reinforcement learning (RL). One state-of-the-art exploration method, adversarially guided actor–critic (AGAC), employs adversarial learning to drive exploration by diversifying the actions of the deep RL agent. Specifically, in the actor–critic (AC) framework, which consists of a policy (the actor) and a value function (the critic), AGAC introduces an adversary that mimics the actor. AGAC then constructs an action-based adversarial advantage (ABAA) to update the actor. This ABAA guides the deep RL agent toward actions that diverge from the adversary's predictions while maximizing expected returns. Although the ABAA drives AGAC to explore procedurally-generated environments, it can affect the balance between exploration and exploitation during the training period, thereby impairing AGAC's performance. To mitigate this adverse effect and improve AGAC's performance, we propose efficient adversarially guided actor–critic (EAGAC). EAGAC introduces a state-based adversarial advantage (SBAA) that directs the deep RL agent toward actions leading to states with different action distributions from those of the adversary while maximizing expected returns. EAGAC combines this SBAA with the ABAA to form a joint adversarial advantage, and then employs this joint adversarial advantage to update the actor. To further reduce this adverse effect and enhance performance, EAGAC stores past positive episodes in the replay buffer and utilizes experiences sampled from this buffer to optimize the actor through self-imitation learning (SIL). The experimental results in procedurally-generated environments from MiniGrid and the 3-D navigation environment from ViZDoom show our EAGAC method significantly outperforms AGAC and other state-of-the-art exploration methods in both sample efficiency and final performance.","PeriodicalId":55977,"journal":{"name":"IEEE Transactions on Games","volume":"17 2","pages":"346-359"},"PeriodicalIF":1.7,"publicationDate":"2024-09-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142218415","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning Individual Potential-Based Rewards in Multiagent Reinforcement Learning","authors":"Chen Yang;Pei Xu;Junge Zhang","doi":"10.1109/TG.2024.3450475","DOIUrl":"10.1109/TG.2024.3450475","url":null,"abstract":"A great challenge for applying multiagent reinforcement learning (MARL) in the field of game artificial intelligence (AI) is to enable agents to learn diversified policies to handle different game-specific problems, while receiving only a shared team reward. At present, a common approach is reward shaping, which focuses on designing rewards for agents to guide cooperation. However, most of the existing methods require prior knowledge on the environment for reward design or alter the optimal policies after imposing extra rewards. Besides, previous MARL methods that rely on manually designed rewards can hardly generalize across different game environments. To this end, we propose a new MARL method that learns individual potential-based rewards for agents. Specifically, we learn a parameterized potential function for each agent to generate individual rewards in the discounted temporal difference form. The whole update procedure is modeled as the bilevel optimization problem, where the lower level is to optimize policies with potential-based rewards, and the upper level is to optimize parameterized potential functions toward maximizing the environment return. We theoretically prove that the individual potential-based rewards can guarantee policy invariance for agents, so that the optimization objective is consistent with the original MARL problem. We evaluate our method with a number of existing state-of-the-art MARL methods on predator–prey and <italic>StarCraft II</i> game environments. Empirical results show that our proposed method significantly outperforms baseline methods and achieves better game AI that enjoys high performance and generalization.","PeriodicalId":55977,"journal":{"name":"IEEE Transactions on Games","volume":"17 2","pages":"334-345"},"PeriodicalIF":1.7,"publicationDate":"2024-08-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142218416","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"STEP: A Framework for Automated Point Cost Estimation","authors":"George E.M. Long;Diego Perez-Liebana;Spyridon Samothrakis","doi":"10.1109/TG.2024.3450992","DOIUrl":"10.1109/TG.2024.3450992","url":null,"abstract":"In miniature wargames, such as \u0000<italic>Warhammer 40k</i>\u0000, players control asymmetrical armies, which include multiple units of different types and strengths. These games often use point costs to balance the armies. Each unit is assigned a point cost, and players have a budget they can spend on units. Calculating accurate point costs can be a tedious manual process, with iterative playtests required. If these point costs do not represent a units true power, the game can get unbalanced as overpowered units can have low point costs. In our previous paper, we proposed an automated way of estimating the point costs using a linear regression approach. We used a turn-based asymmetrical wargame called \u0000<italic>Wizard Wars</i>\u0000 to test our methods. Players were simulated using Monte Carlo tree search, using different heuristics to represent playstyles. We presented six variants of our method, and show that one method was able to reduce the unbalanced nature of the game by almost half. For this article, we introduce a framework called simple testing and evaluation of points, which allows for further and more granular analysis of point cost estimating methods, by providing a fast, simple, and configurable framework to test methods with. Finally, we compare how our methods do in \u0000<italic>Wizard Wars</i>\u0000 against expertly chosen point costs.","PeriodicalId":55977,"journal":{"name":"IEEE Transactions on Games","volume":"16 4","pages":"927-936"},"PeriodicalIF":1.7,"publicationDate":"2024-08-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142218210","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving Conditional Level Generation Using Automated Validation in Match-3 Games","authors":"Monica Villanueva Aylagas;Joakim Bergdahl;Jonas Gillberg;Alessandro Sestini;Theodor Tolstoy;Linus Gisslén","doi":"10.1109/TG.2024.3440214","DOIUrl":"10.1109/TG.2024.3440214","url":null,"abstract":"Generative models for level generation have shown great potential in game production. However, they often provide limited control over the generation, and the validity of the generated levels is unreliable. Despite this fact, only a few approaches that learn from existing data provide the users with ways of controlling the generation, simultaneously addressing the generation of unsolvable levels. This article proposes autovalidated level generation, a novel method to improve models that learn from existing level designs using difficulty statistics extracted from gameplay. In particular, we use a conditional variational autoencoder to generate layouts for match-3 levels, conditioning the model on precollected statistics, such as game mechanics like difficulty, and relevant visual features, such as size and symmetry. Our method is general enough that multiple approaches could potentially be used to generate these statistics. We quantitatively evaluate our approach by comparing it to an ablated model without difficulty conditioning. In addition, we analyze both quantitatively and qualitatively whether the style of the dataset is preserved in the generated levels. Our approach generates more valid levels than the same method without difficulty conditioning.","PeriodicalId":55977,"journal":{"name":"IEEE Transactions on Games","volume":"16 4","pages":"783-792"},"PeriodicalIF":1.7,"publicationDate":"2024-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141944749","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Relation-Aware Learning for Multitask Multiagent Cooperative Games","authors":"Yang Yu;Likun Yang;Zhourui Guo;Yongjian Ren;Qiyue Yin;Junge Zhang;Kaiqi Huang","doi":"10.1109/TG.2024.3436871","DOIUrl":"10.1109/TG.2024.3436871","url":null,"abstract":"Collaboration among multiple tasks is advantageous for enhancing learning efficiency in multiagent reinforcement learning. To guide agents in cooperating with different teammates in multiple tasks, contemporary approaches encourage agents to exploit common cooperative patterns or identify the learning priorities of multiple tasks. Despite the progress made by these methods, they all assume that all cooperative tasks to be learned are related and desire similar agent policies. This is rarely the case in multiagent cooperation, where minor changes in team composition can lead to significant variations in cooperation, resulting in distinct cooperative strategies compete for limited learning resources. In this article, to tackle the challenge posed by multitask learning in potentially competing cooperative tasks, we propose a novel framework called relation-aware learning (RAL). RAL incorporates a relation awareness module in both task representation and task optimization, aiding in reasoning about task relationships and mitigating negative transfers among dissimilar tasks. To assess the performance of RAL, we conduct a comparative analysis with baseline methods in a multitask <italic>StarCraft</i> environment. The results demonstrate the superiority of RAL in multitask cooperative scenarios, particularly in scenarios involving multiple conflicting tasks.","PeriodicalId":55977,"journal":{"name":"IEEE Transactions on Games","volume":"17 2","pages":"322-333"},"PeriodicalIF":1.7,"publicationDate":"2024-08-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141882419","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Detecting Discrepancies Between Subtitles and Audio in Gameplay Videos With EchoTest","authors":"Ian Gauk;Cor-Paul Bezemer","doi":"10.1109/TG.2024.3435799","DOIUrl":"10.1109/TG.2024.3435799","url":null,"abstract":"The landscape of accessibility features in video games remains inconsistent, posing challenges for gamers who seek experiences tailored to their needs. Accessibility features, such as subtitles are widely used by players but are difficult to test manually due to the large scope of games and the variability in how subtitles can appear. In this article, we introduce an automated approach (<sc>EchoTest</small>) to extract subtitles and spoken audio from a gameplay video, convert them into text, and compare them to detect discrepancies, such as typos, desynchronization, and missing text. <sc>EchoTest</small> can be used by game developers to identify discrepancies between subtitles and spoken audio in their games, enabling them to better test the accessibility of their games. In an empirical study on gameplay videos from 15 popular games, <sc>EchoTest</small> can verify discrepancies between subtitles and audio with a precision of 98% and a recall of 89%. In addition, <sc>EchoTest</small> performs well with a precision of 73% and a recall of 99% on a challenging generated benchmark.","PeriodicalId":55977,"journal":{"name":"IEEE Transactions on Games","volume":"17 1","pages":"224-234"},"PeriodicalIF":1.7,"publicationDate":"2024-07-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141863123","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Biosignal Contrastive Representation Learning for Emotion Recognition of Game Users","authors":"Rongyang Li;Jianguo Ding;Huansheng Ning","doi":"10.1109/TG.2024.3435339","DOIUrl":"10.1109/TG.2024.3435339","url":null,"abstract":"Biosignal representation learning (BRL) plays a crucial role in emotion recognition for game users (ERGU). Unsupervised BRL has garnered attention considering the difficulty in obtaining ground truth emotion labels from game users. However, unsupervised BRL in ERGU faces challenges, including overfitting caused by limited data and performance degradation due to unbalanced sample distributions. Faced with the above challenges, we propose a novel method of biosignal contrastive representation learning (BCRL) for ERGU, which not only serves as a unified representation learning approach applicable to various modalities of biosignals but also derives generalized biosignals representations suitable for different downstream tasks. Specifically, we solve the overfitting by introducing perturbations at the embedding layer based on the projected gradient descent (PGD) adversarial attacks and develop the sample balancing strategy (SBS) to mitigate the negative impact of the unbalanced sample on the performance. Further, we have conducted comprehensive validation experiments on the public dataset, yielding the following key observations: first BCRL outperforms all other methods, achieving average accuracies of 76.67%, 71.83%, and 63.58% in 1D-2 C Valence, 1D-2 C Arousal, and 2D-4 C Valence/Arousal, respectively; second, the ablation study shows that both the PGD module (+7.58% in accuracy on average) and the SBS module (+14.60% in accuracy on average) have a positive effect on the performance of different classifications; third, BCRL model exhibits the certain generalization ability across the different games, subjects and classifiers.","PeriodicalId":55977,"journal":{"name":"IEEE Transactions on Games","volume":"17 2","pages":"308-321"},"PeriodicalIF":1.7,"publicationDate":"2024-07-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141863122","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"More Human-Like Gameplay by Blending Policies From Supervised and Reinforcement Learning","authors":"Tatsuyoshi Ogawa;Chu-Hsuan Hsueh;Kokolo Ikeda","doi":"10.1109/TG.2024.3424668","DOIUrl":"10.1109/TG.2024.3424668","url":null,"abstract":"Modeling human players' behaviors in games is a key challenge for making natural computer players, evaluating games, and generating content. To achieve better human–computer interaction, researchers have tried various methods to create human-like artificial intelligence. In chess and \u0000<italic>Go</i>\u0000, supervised learning with deep neural networks is known as one of the most effective ways to predict human moves. However, for many other games (e.g., \u0000<italic>Shogi</i>\u0000), it is hard to collect a similar amount of game records, resulting in poor move-matching accuracy of the supervised learning. We propose a method to compensate for the weakness of the supervised learning policy by Blending it with an AlphaZero-like reinforcement learning policy. Experiments on \u0000<italic>Shogi</i>\u0000 showed that the Blend method significantly improved the move-matching accuracy over supervised learning models. Experiments on chess and \u0000<italic>Go</i>\u0000 with a limited number of game records also showed similar results. The Blend method was effective with both medium and large numbers of games, particularly the medium case. We confirmed the robustness of the Blend model to the parameter and discussed the mechanism why the move-matching accuracy improves. In addition, we showed that the Blend model performed better than existing work that tried to improve the move-matching accuracy.","PeriodicalId":55977,"journal":{"name":"IEEE Transactions on Games","volume":"16 4","pages":"831-843"},"PeriodicalIF":1.7,"publicationDate":"2024-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10595450","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141611394","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Progress-Based Algorithm for Interpretable Reinforcement Learning in Regression Testing","authors":"Pablo Gutiérrez-Sánchez;Marco A. Gómez-Martín;Pedro A. González-Calero;Pedro P. Gómez-Martín","doi":"10.1109/TG.2024.3426601","DOIUrl":"10.1109/TG.2024.3426601","url":null,"abstract":"In video games, the validation of design specifications throughout the development process poses a major challenge as the project grows in complexity and scale and purely manual testing becomes very costly. This article proposes a new approach to design validation regression testing based on a reinforcement learning technique guided by tasks expressed in a formal logic specification language (truncated linear temporal logic) and the progress made in completing these tasks. This requires no prior knowledge of machine learning to train testing bots, is naturally interpretable and debuggable, and produces dense reward functions without the need for reward shaping. We investigate the validity of our strategy by comparing it to an imitation baseline in experiments organized around three use cases of typical scenarios in commercial video games on a 3-D stealth testing environment created in unity. For each scenario, we analyze the agents' reactivity to modifications in common assets to accommodate design needs in other sections of the game, and their ability to report unexpected gameplay variations. Our experiments demonstrate the practicality of our approach for training bots to conduct automated regression testing in complex video game settings.","PeriodicalId":55977,"journal":{"name":"IEEE Transactions on Games","volume":"16 4","pages":"844-853"},"PeriodicalIF":1.7,"publicationDate":"2024-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10595449","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141614866","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}