Qualitative reinforcement learning to accelerate finding an optimal policy
Fatemeh Telgerdi, A. Khalilian, A. Pouyan
2014 4th International Conference on Computer and Knowledge Engineering (ICCKE), 2014-12-22
DOI: https://doi.org/10.1109/ICCKE.2014.6993424
Citations: 1
Abstract
Reinforcement Learning (RL) is a popular area of machine learning in which an autonomous agent improves its behavior through interaction with its environment. However, this process is often time-consuming and costly, and convergence to an optimal policy can be slow. One way to alleviate this problem is qualitative learning, which provides the agent with some initial knowledge about the environment. This paper introduces a new algorithm based on qualitative learning that aggregates states after some early episodes of learning. Learning then continues in the new qualitative environment. To evaluate the proposed algorithm, experiments were conducted on two benchmark environments. The results demonstrate the effectiveness of the new algorithm in accelerating the learning process.
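The two-phase idea described in the abstract (learn briefly on the raw states, aggregate them, then keep learning on the aggregated states) can be illustrated with a minimal sketch. The abstract does not state the aggregation rule, so everything specific here is an assumption made for illustration only: the toy corridor environment, the criterion that states sharing a greedy action are merged, the mean-value initialisation of the merged table, and all function names and hyperparameters. It should not be read as the authors' algorithm.

```python
import random

N_STATES, ACTIONS = 10, (-1, +1)   # hypothetical corridor; move left / right
GOAL = N_STATES - 1

def step(state, action_idx):
    """Deterministic corridor: reward 1 on reaching the goal, else 0."""
    nxt = min(max(state + ACTIONS[action_idx], 0), GOAL)
    return nxt, (1.0 if nxt == GOAL else 0.0), nxt == GOAL

def greedy(q_row):
    """Index of the best action, with ties broken at random."""
    best = max(q_row)
    return random.choice([i for i, v in enumerate(q_row) if v == best])

def q_learning(q, to_row, episodes, alpha=0.5, gamma=0.9, eps=0.2, max_steps=200):
    """Tabular Q-learning; `to_row` maps a raw state to a row of `q`."""
    for _ in range(episodes):
        s = 0
        for _ in range(max_steps):
            row = to_row(s)
            explore = random.random() < eps
            a = random.randrange(len(ACTIONS)) if explore else greedy(q[row])
            s2, r, done = step(s, a)
            target = r if done else r + gamma * max(q[to_row(s2)])
            q[row][a] += alpha * (target - q[row][a])
            s = s2
            if done:
                break
    return q

def aggregate(q):
    """Assumed criterion: states sharing a greedy action form one abstract state."""
    labels = [max(range(len(ACTIONS)), key=lambda a: row[a]) for row in q]
    groups = sorted(set(labels))
    mapping = [groups.index(lab) for lab in labels]
    # Initialise each abstract row with the mean Q-values of its member states.
    q_abs = []
    for g in range(len(groups)):
        members = [q[s] for s in range(len(q)) if mapping[s] == g]
        q_abs.append([sum(col) / len(members) for col in zip(*members)])
    return mapping, q_abs

random.seed(0)
# Phase 1: a few early episodes on the raw state space.
q_raw = q_learning([[0.0] * len(ACTIONS) for _ in range(N_STATES)], lambda s: s, episodes=20)
# Phase 2: aggregate states, then continue learning on the smaller table.
mapping, q_abs = aggregate(q_raw)
q_abs = q_learning(q_abs, lambda s: mapping[s], episodes=20)
```

In this sketch the second phase updates a table with at most one row per action instead of one row per corridor cell, which is the kind of reduction that could plausibly speed up learning. Whether the paper aggregates states this way cannot be determined from the abstract.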