Using Reinforcement Learning to Generate Levels of Super Mario Bros. With Quality and Diversity

Impact factor 2.8; Zone 4, Computer Science; JCR Q3, Computer Science, Artificial Intelligence

IEEE Transactions on Games Pub Date : 2024-06-19 DOI:10.1109/TG.2024.3416472

SangGyu Nam;Chu-Hsuan Hsueh;Pavinee Rerkjirattikal;Kokolo Ikeda

IEEE Transactions on Games, vol. 16, no. 4, pp. 807-820. https://ieeexplore.ieee.org/document/10564107/

Citation count: 0

Abstract

Procedural content generation (PCG) is essential in game development, automating content creation to meet various criteria such as playability, diversity, and quality. This article leverages reinforcement learning (RL) for PCG to generate Super Mario Bros. levels. We formulate the problem into a Markov decision process (MDP), with rewards defined using player enjoyment-based evaluation functions. Challenges in level representation and difficulty assessment are addressed by conditional generative adversarial networks and human-like artificial intelligence agents that mimic aspects of human input inaccuracies. This ensures that the generated levels are appropriately challenging from human perspectives. Furthermore, we enhance content quality through virtual simulation, which assigns rewards to intermediate actions to address a credit assignment problem. We also ensure diversity through a diversity-aware greedy policy, which chooses not-bad-but-distant actions based on $Q$-values. These processes ensure the production of diverse and high-quality Super Mario levels. Human subject evaluations revealed that levels generated from our approach exhibit natural connection, appropriate difficulty, nonmonotony, and diversity, highlighting the effectiveness of our proposed methods. The novelty of our work lies in the innovative solutions we propose to address challenges encountered in employing the PCG via RL method in Super Mario Bros., contributing to the field of PCG for game development.
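
The abstract describes the diversity-aware greedy policy only as choosing "not-bad-but-distant" actions based on $Q$-values. As a rough illustration of that idea, and not the authors' implementation, the sketch below keeps the actions whose $Q$-value is within a tolerance of the best one and then picks the candidate farthest from previously chosen actions. The function name, the `tolerance` threshold, the per-action `features` vectors, and the use of Euclidean distance are all assumptions made for this example; the abstract does not specify them.

```python
import math


def diversity_aware_greedy(q_values, features, history, tolerance=0.1):
    """Hypothetical sketch: choose a 'not bad but distant' action.

    q_values  -- estimated Q-value for each action (index = action id)
    features  -- one feature vector per action (assumed representation)
    history   -- feature vectors of previously chosen actions
    tolerance -- how far below the best Q-value still counts as 'not bad'
    """
    best_q = max(q_values)
    # "Not bad": every action whose Q-value is close to the greedy maximum.
    candidates = [a for a, q in enumerate(q_values) if q >= best_q - tolerance]

    if not history:
        # Nothing to be distant from yet, so fall back to plain greedy.
        return max(candidates, key=lambda a: q_values[a])

    def distance_to_history(a):
        # Distance to the nearest previously chosen action (assumed Euclidean).
        return min(math.dist(features[a], h) for h in history)

    # "Distant": among the acceptable candidates, maximize that distance.
    return max(candidates, key=distance_to_history)


q_values = [1.0, 0.95, 0.2, 0.93]
features = [(0, 0), (1, 0), (9, 9), (5, 5)]
history = [(0, 0)]
print(diversity_aware_greedy(q_values, features, history))  # 3
```

In this toy run, action 2 is the most distant from the history but is excluded because its $Q$-value is too low, while action 3 is both acceptable and far from what was chosen before. A plain greedy policy would have returned action 0 every time.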




Source journal

IEEE Transactions on Games (Engineering: Electrical and Electronic Engineering)

CiteScore

4.60

Self-citation rate

8.70%

Number of articles published