A New Learning Algorithm for the MAXQ Hierarchical Reinforcement Learning Method
F. Mirzazadeh, B. Behsaz, H. Beigy
2007 International Conference on Information and Communication Technology, 2007-03-07
DOI: 10.1109/ICICT.2007.375352
Citations: 9
Abstract
The MAXQ hierarchical reinforcement learning method is computationally expensive in applications with deep hierarchies. In this paper, we propose a new learning algorithm for the MAXQ method to address the open problem of reducing its computational complexity. The computational cost of the new algorithm is considerably lower, while its storage requirement is less than twice that of the original learning algorithm. Our experimental results on the simple taxi domain problem show satisfactory behavior of the new algorithm.
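
As background only, the following minimal sketch shows the standard MAXQ value decomposition, Q(i, s, a) = V(a, s) + C(i, s, a), where V of a composite task is the maximum of its Q-values over child tasks. It is not the algorithm proposed in this paper, which the abstract does not describe. The task names, the `MaxQTables` class, and the `evaluations` counter are hypothetical and chosen for illustration. The counter gives one way to see why evaluation cost might grow with hierarchy depth: each value query for a composite task recurses through its whole subtree.

```python
from collections import defaultdict

# Hypothetical two-level hierarchy (illustrative, not taken from the paper):
# each composite task maps to its child tasks; anything else is primitive.
HIERARCHY = {
    "root": ["get", "put"],
    "get": ["north", "south", "pickup"],
    "put": ["north", "south", "dropoff"],
}


class MaxQTables:
    """Tables for the standard MAXQ decomposition (background sketch)."""

    def __init__(self, hierarchy):
        self.hierarchy = hierarchy
        self.v_primitive = defaultdict(float)  # V(a, s) for primitive actions a
        self.completion = defaultdict(float)   # C(i, s, a) for composite tasks i
        self.evaluations = 0                   # counts recursive value() calls

    def is_primitive(self, task):
        return task not in self.hierarchy

    def value(self, task, state):
        """V(i, s): stored for primitives, max over children for composites."""
        self.evaluations += 1
        if self.is_primitive(task):
            return self.v_primitive[(task, state)]
        return max(self.q(task, state, child) for child in self.hierarchy[task])

    def q(self, task, state, child):
        """Q(i, s, a) = V(a, s) + C(i, s, a)."""
        return self.value(child, state) + self.completion[(task, state, child)]


tables = MaxQTables(HIERARCHY)
tables.v_primitive[("pickup", 0)] = 1.0
tables.completion[("root", 0, "get")] = 2.0
root_value = tables.value("root", 0)
print(root_value, tables.evaluations)
```

In this sketch a single query at the root touches every node below it (one root, two subtasks, six primitive leaves), so adding levels to the hierarchy multiplies the work per query.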