ACE-RL-Checkers: Improving automatic case elicitation through knowledge obtained by reinforcement learning in player agents

H. C. Neto, Rita Maria Silva Julia
DOI: 10.1109/CIG.2015.7317926
Published in: 2015 IEEE Conference on Computational Intelligence and Games (CIG)
Publication date: 2015-11-05
Citations: 2

Abstract

This work proposes a new approach that combines Automatic Case Elicitation (ACE) with Reinforcement Learning (RL) in Checkers player agents. Relative to agents that use each technique in isolation, the combination improves the random exploration performed by ACE-based agents and introduces adaptability into RL-based agents. Accordingly, the authors present the ACE-RL-Checkers player agent, a hybrid system that combines the best abilities of the automatic Checkers players CHEBR and LS-VisionDraughts. CHEBR is an ACE-based agent whose learning approach performs random exploration of the search space. This random exploration gives the agent highly adaptive, non-deterministic behavior. On the other hand, the high frequency of random decisions, mainly in phases when the case library is still sparse, compromises the agent's ability to maintain good performance. LS-VisionDraughts is a Multi-Layer Perceptron neural network player trained through Reinforcement Learning. Although proven efficient at decision-making, such an agent has the drawback of being completely predictable: it always executes the same move when presented with the same board. By combining the best abilities of these players, ACE-RL-Checkers uses knowledge provided by LS-VisionDraughts to direct the random exploration of the automatic case elicitation technique toward more promising regions of the search space. As a result, ACE-RL-Checkers gains in performance and acquires adaptability in its decision-making, choosing moves based on the current game dynamics. Experiments carried out in tournaments involving these agents confirm the performance superiority of ACE-RL-Checkers over its adversaries.
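The hybrid decision scheme described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names, the board encoding as a feature tuple, and the linear stand-in for LS-VisionDraughts' MLP evaluation are all hypothetical. The idea it demonstrates is the one in the abstract: reuse an elicited case when the library has one, and otherwise let a learned evaluation, rather than pure chance, steer exploration toward promising successor boards.

```python
import random

def evaluate(board, weights):
    # Stand-in for the RL-trained evaluation: a simple weighted feature
    # sum over a board encoded as a tuple of numeric features.
    # (Hypothetical; the real agent uses a Multi-Layer Perceptron.)
    return sum(w * f for w, f in zip(weights, board))

def choose_move(board, candidates, case_library, weights, epsilon=0.1):
    """Hybrid move selection in the spirit of ACE-RL-Checkers.

    candidates   -- list of (move, successor_board) pairs
    case_library -- maps a board to a previously elicited move
    epsilon      -- residual random-exploration rate
    """
    key = tuple(board)
    if key in case_library:
        return case_library[key]            # reuse a stored case
    if random.random() < epsilon:
        return random.choice(candidates)[0]  # occasional random probe
    # Otherwise direct exploration with the learned evaluation,
    # favoring the successor board it scores highest.
    return max(candidates, key=lambda mc: evaluate(mc[1], weights))[0]
```

With `epsilon=0` the agent is fully guided by the case library and the evaluation; raising `epsilon` restores some of CHEBR's non-deterministic exploration, which is the trade-off the paper balances.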