{"title":"Automatic Discovery and Transfer of MAXQ Hierarchies in a Complex System","authors":"Hongbing Wang, Wenya Li, Xuan Zhou","doi":"10.1109/ICTAI.2012.165","DOIUrl":null,"url":null,"abstract":"Reinforcement learning has been an important category of machine learning approaches exhibiting self-learning and online learning characteristics. Using reinforcement learning, an agent can learn its behaviors through trial-and-error interactions with a dynamic environment and finally come up with an optimal strategy. Reinforcement learning suffers the curse of dimensionality, though there has been significant progress to overcome this issue in recent years. MAXQ is one of the most common approaches for reinforcement learning. To function properly, MAXQ requires a decomposition of the agent's task into a task hierarchy. Previously, the decomposition can only be done manually. In this paper, we propose a mechanism for automatic subtask discovery. The mechanism applies clustering to automatically construct task hierarchy required by MAXQ, such that MAXQ can be fully automated. We present the design of our mechanism, and demonstrate its effectiveness through theoretical analysis and an extensive experimental evaluation.","PeriodicalId":155588,"journal":{"name":"2012 IEEE 24th International Conference on Tools with Artificial Intelligence","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2012-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2012 IEEE 24th International Conference on Tools with Artificial Intelligence","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICTAI.2012.165","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 2
Abstract
Reinforcement learning is an important category of machine learning approaches, characterized by self-learning and online learning. Using reinforcement learning, an agent can learn its behavior through trial-and-error interaction with a dynamic environment and eventually arrive at an optimal policy. Reinforcement learning suffers from the curse of dimensionality, although significant progress has been made on this issue in recent years. MAXQ is one of the most common approaches to hierarchical reinforcement learning. To function properly, MAXQ requires a decomposition of the agent's task into a task hierarchy. Previously, this decomposition could only be done manually. In this paper, we propose a mechanism for automatic subtask discovery. The mechanism applies clustering to automatically construct the task hierarchy required by MAXQ, so that MAXQ can be fully automated. We present the design of our mechanism and demonstrate its effectiveness through theoretical analysis and extensive experimental evaluation.
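As an illustration of what a MAXQ task hierarchy looks like, the sketch below hand-codes the classic Taxi-domain decomposition (Dietterich's standard MAXQ example), i.e. the kind of structure the proposed clustering mechanism is meant to discover automatically rather than require by hand. The `Subtask` class, state-dictionary keys, and function names are illustrative assumptions, not the authors' code or the paper's algorithm.

```python
# Minimal sketch of a hand-written MAXQ task hierarchy for the Taxi domain.
# This only shows the data structure a MAXQ learner decomposes over; it is
# not an implementation of the paper's automatic-discovery mechanism.

from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class Subtask:
    name: str
    children: List["Subtask"] = field(default_factory=list)
    # Termination predicate over states; primitive actions terminate immediately.
    is_terminal: Callable[[dict], bool] = lambda s: True

# Primitive actions (leaves of the hierarchy).
north, south, east, west = (Subtask(a) for a in ("North", "South", "East", "West"))
pickup, putdown = Subtask("Pickup"), Subtask("Putdown")

# Composite subtasks: Navigate moves the taxi to a target location, Get fetches
# the passenger, Put delivers the passenger, and Root solves the whole episode.
# Navigate is shared by Get and Put, so the hierarchy is a DAG rather than a tree.
navigate = Subtask("Navigate",
                   children=[north, south, east, west],
                   is_terminal=lambda s: s["taxi"] == s["target"])
get = Subtask("Get",
              children=[navigate, pickup],
              is_terminal=lambda s: s["passenger"] == "in_taxi")
put = Subtask("Put",
              children=[navigate, putdown],
              is_terminal=lambda s: s["passenger"] == s["destination"])
root = Subtask("Root",
               children=[get, put],
               is_terminal=lambda s: s["passenger"] == s["destination"])

def print_hierarchy(task: Subtask, depth: int = 0) -> None:
    """Print the subtask graph that MAXQ value decomposition is defined over."""
    print("  " * depth + task.name)
    for child in task.children:
        print_hierarchy(child, depth + 1)

if __name__ == "__main__":
    print_hierarchy(root)
```

In this sketch the hierarchy is written down manually; the point of the paper is to obtain such a structure automatically, by clustering states or experience so that recurring regions of the state space become candidate subtasks with their own termination conditions.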