Simple Reinforcement Learning for Small-Memory Agent
A. Notsu, Katsuhiro Honda, H. Ichihashi, Yuki Komori
2011 10th International Conference on Machine Learning and Applications and Workshops, 2011-12-18
DOI: 10.1109/ICMLA.2011.127 (https://doi.org/10.1109/ICMLA.2011.127)
Citations: 7
Abstract
In this paper, we propose Simple Reinforcement Learning, a method for reinforcement learning agents with small memory. In the real world, learning is difficult because there are infinitely many states and actions, which demand a large amount of stored memory and long learning times. To address this problem, estimated values are categorized as "GOOD" or "NO GOOD" during the reinforcement learning process. Additionally, the alignment sequence of the estimated values is changed, because the sequence itself is regarded as important. We conducted several simulations to observe the influence of these methods. The results show no adverse effect on learning speed.
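The abstract does not give the update rule, so the following is only a minimal sketch of what storing a "GOOD" / "NO GOOD" label in place of a numeric value estimate might look like. Everything in it is an assumption made for illustration rather than the authors' algorithm: the class name `BinaryValueAgent`, the one-step estimate, the discount factor `gamma`, and the cut-off `threshold` are all hypothetical.

```python
from collections import defaultdict

GOOD, NO_GOOD = True, False


class BinaryValueAgent:
    """Hypothetical tabular agent keeping one GOOD/NO GOOD flag per (state, action)."""

    def __init__(self, n_actions, gamma=0.9, threshold=0.5):
        self.n_actions = n_actions
        self.gamma = gamma          # discount factor (assumed, not from the paper)
        self.threshold = threshold  # cut-off between GOOD and NO GOOD (assumed)
        # Pairs that have never been updated default to NO GOOD.
        self.flags = defaultdict(lambda: [NO_GOOD] * n_actions)

    def best_action(self, state):
        """Return the first action flagged GOOD, or action 0 if none is."""
        row = self.flags[state]
        return row.index(GOOD) if GOOD in row else 0

    def update(self, state, action, reward, next_state, done):
        """Re-label (state, action) from a one-step estimate of its return."""
        # The next state counts as worth 1.0 if any of its actions is GOOD.
        future = 0.0 if done else float(any(self.flags[next_state]))
        estimate = reward + self.gamma * future
        self.flags[state][action] = GOOD if estimate >= self.threshold else NO_GOOD


# Tiny chain example: taking action 1 in state 1 reaches the goal (reward 1).
agent = BinaryValueAgent(n_actions=2)
agent.update(state=1, action=1, reward=1.0, next_state=2, done=True)
agent.update(state=0, action=1, reward=0.0, next_state=1, done=False)
print(agent.best_action(0))
```

In this sketch each state-action pair costs a single boolean instead of a floating-point estimate, which is one way to read the memory saving the abstract describes. The reordering of estimated values mentioned in the abstract is not modeled here, because the abstract gives too little detail to sketch it.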