Developing game AI agent behaving like human by mixing reinforcement learning and supervised learning

2017 18th IEEE/ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing (SNPD). Pub Date: 2017-06-01. DOI: 10.1109/SNPD.2017.8022767

Shohei Miyashita, Xinyu Lian, Xiao Zeng, Takashi Matsubara, K. Uehara

Citations: 10

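Abstract

Artificial intelligence (AI) agents built with Deep Q-Networks (DQN) can defeat human players in video games. Despite its high performance, DQN often exhibits odd behaviors that can break immersion, which defeats the purpose of creating game AI. Moreover, DQN can react to the game environment much faster than humans, making it invincible (and thus not fun to play against) in certain types of games. Supervised learning, by contrast, trains an AI agent on historical play data from human players. Because it imitates that data, a supervised learning agent behaves more like a human than a reinforcement learning agent does, but its performance is often no better than that of human players. The ultimate purpose of game AI agents is to entertain human players, so good performance and human-like behavior are both important and should be achieved simultaneously. This study proposes two frameworks that combine reinforcement learning and supervised learning, called the separated network model and the shared network model. Their performance was evaluated by game scores and their behavior by a Turing test. The experimental results show that the proposed frameworks produce AI agents that perform better than human players and behave more naturally than reinforcement learning agents.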
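The abstract names a separated network model and a shared network model but does not say how either one combines the two learning signals. The sketch below is only one plausible reading, not the authors' method: it assumes the separated model blends the outputs of an RL network and an imitation network at decision time, and that the shared model trains one network on a weighted sum of a temporal-difference loss and an imitation loss. The function names, the `weight` parameter, and the linear blending rule are all hypothetical.

```python
import math


def softmax(values, temperature=1.0):
    """Convert raw scores into a probability distribution."""
    peak = max(values)  # subtract the max for numerical stability
    exps = [math.exp((v - peak) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def mix_action_scores(q_values, human_probs, weight):
    """Hypothetical 'separated' combination: two networks, blended outputs.

    q_values    -- action values from an RL network (e.g. a DQN)
    human_probs -- action probabilities from a network trained on human play
    weight      -- assumed knob: 0.0 follows RL only, 1.0 imitation only
    """
    if len(q_values) != len(human_probs):
        raise ValueError("both heads must score the same action set")
    rl_probs = softmax(q_values)
    return [(1.0 - weight) * r + weight * h
            for r, h in zip(rl_probs, human_probs)]


def shared_loss(td_error, human_probs, human_action, weight):
    """Hypothetical 'shared' combination: one network, one weighted loss.

    td_error     -- temporal-difference error for the sampled transition
    human_probs  -- the network's predicted action probabilities
    human_action -- index of the action the human actually took
    """
    rl_loss = td_error ** 2                           # squared TD error
    sl_loss = -math.log(human_probs[human_action])    # cross-entropy
    return (1.0 - weight) * rl_loss + weight * sl_loss


def greedy_action(scores):
    """Index of the highest-scoring action."""
    return max(range(len(scores)), key=scores.__getitem__)


q = [1.0, 3.0, 2.0]        # made-up RL values: action 1 looks best
human = [0.7, 0.1, 0.2]    # made-up human preference: action 0 is typical
rl_only = greedy_action(mix_action_scores(q, human, weight=0.0))
human_only = greedy_action(mix_action_scores(q, human, weight=1.0))
```

In this toy setup a single weight moves the agent between score-maximizing and human-imitating choices, which is the trade-off the abstract describes. The values in `q` and `human` are invented for illustration and are not data from the paper.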
