A Progress-Based Algorithm for Interpretable Reinforcement Learning in Regression Testing

IF 2.8; Zone 4, Computer Science; JCR Q3, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

IEEE Transactions on Games Pub Date : 2024-07-11 DOI:10.1109/TG.2024.3426601

Pablo Gutiérrez-Sánchez;Marco A. Gómez-Martín;Pedro A. González-Calero;Pedro P. Gómez-Martín

IEEE Transactions on Games, vol. 16, no. 4, pp. 844-853. Journal article; not open access. https://ieeexplore.ieee.org/document/10595449/

Citations: 0

Abstract

In video games, validating design specifications throughout development becomes a major challenge as a project grows in complexity and scale and purely manual testing becomes very costly. This article proposes a new approach to regression testing for design validation, based on a reinforcement learning technique guided by tasks expressed in a formal logic specification language (truncated linear temporal logic) and by the progress made in completing these tasks. The approach requires no prior knowledge of machine learning to train testing bots, is naturally interpretable and debuggable, and produces dense reward functions without the need for reward shaping. We investigate the validity of our strategy by comparing it to an imitation baseline in experiments organized around three use cases that represent typical scenarios in commercial video games, run in a 3-D stealth testing environment created in Unity. For each scenario, we analyze the agents' reactivity to modifications of shared assets made to accommodate design needs in other sections of the game, and their ability to report unexpected gameplay variations. Our experiments demonstrate the practicality of our approach for training bots to conduct automated regression testing in complex video game settings.
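
To illustrate why a task written in temporal logic can yield a dense signal without hand-tuned shaping, the sketch below evaluates the commonly used quantitative (robustness) semantics of a small TLTL-style fragment over a finite trace. It is not the progress-based algorithm proposed in the article, whose details the abstract does not give. The state fields `goal_dist` and `guard_dist`, the thresholds, and all function names are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, Sequence

# Hypothetical state representation: a dict of named scalar observations.
State = Dict[str, float]
Trace = Sequence[State]
Formula = Callable[[Trace], float]  # maps a non-empty trace to a robustness value


def less_than(field: str, threshold: float) -> Formula:
    """Predicate `state[field] < threshold`, evaluated at the first state of the trace."""
    return lambda trace: threshold - trace[0][field]


def neg(phi: Formula) -> Formula:
    """Negation flips the sign of the robustness."""
    return lambda trace: -phi(trace)


def conj(a: Formula, b: Formula) -> Formula:
    """Conjunction is as robust as its weakest operand."""
    return lambda trace: min(a(trace), b(trace))


def eventually(phi: Formula) -> Formula:
    """Best robustness of phi over all suffixes of the trace."""
    return lambda trace: max(phi(trace[i:]) for i in range(len(trace)))


def always(phi: Formula) -> Formula:
    """Worst robustness of phi over all suffixes of the trace."""
    return lambda trace: min(phi(trace[i:]) for i in range(len(trace)))


# Hypothetical stealth task: eventually get within 1.0 of the goal
# while always staying at least 2.0 away from the guard.
task = conj(
    eventually(less_than("goal_dist", 1.0)),
    always(neg(less_than("guard_dist", 2.0))),
)

trace = [
    {"goal_dist": 5.0, "guard_dist": 4.0},
    {"goal_dist": 3.0, "guard_dist": 3.0},
    {"goal_dist": 0.5, "guard_dist": 2.5},
]

# Robustness of each growing prefix: positive means the task is satisfied so far.
scores = [task(trace[:k]) for k in range(1, len(trace) + 1)]
print(scores)
```

In this toy trace the score rises as the agent approaches the goal and only becomes positive once the goal is reached without alerting the guard. A graded value of this kind is what makes a specification readable as a reward, and a negative value points to the sub-formula that is failing.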


Source journal: IEEE Transactions on Games (Engineering: Electrical and Electronic Engineering)

CiteScore: 4.60

Self-citation rate: 8.70%

Number of articles published: