USAK METHOD FOR THE REINFORCEMENT LEARNING

M. Novotarskyi, V. Kuzmich
{"title":"USAK METHOD FOR THE REINFORCEMENT LEARNING","authors":"M. Novotarskyi, V. Kuzmich","doi":"10.20535/2708-4930.1.2020.216042","DOIUrl":null,"url":null,"abstract":"In the field of reinforcement learning, tabular methods have become widespread. There are many important scientific results, which significantly improve their performance in specific applications. However, the application of tabular methods is limited due to the large amount of resources required to store value functions in tabular form under high-dimensional state spaces. A natural solution to the memory problem is to use parameterized function approximations. However, conventional approaches to function approximations, in most cases, have ceased to give the desired result of memory reduction in solving realworld problems. This fact became the basis for the application of new approaches, one of which is the use of Sparse Distributed Memory (SDM) based on Kanerva coding. A further development of this direction was the method of Similarity-Aware Kanerva (SAK). In this paper, a modification of the SAK method is proposed, the Uniform Similarity-Aware Kanerva (USAK) method, which is based on the uniform distribution of prototypes in the state space. This approach has reduced the use of RAM required to store prototypes. In addition, reducing the receptive distance of each of the prototypes made it possible to increase the learning speed by reducing the number of calculations in the linear approximator.","PeriodicalId":411692,"journal":{"name":"Information, Computing and Intelligent systems","volume":"C-17 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Information, Computing and Intelligent systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.20535/2708-4930.1.2020.216042","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

In the field of reinforcement learning, tabular methods have become widespread. Many important scientific results significantly improve their performance in specific applications. However, the applicability of tabular methods is limited by the large amount of memory required to store value functions in tabular form over high-dimensional state spaces. A natural solution to this memory problem is to use parameterized function approximation. However, conventional approaches to function approximation have, in most cases, ceased to deliver the desired memory reduction when solving real-world problems. This fact became the basis for applying new approaches, one of which is the use of Sparse Distributed Memory (SDM) based on Kanerva coding. A further development in this direction is the Similarity-Aware Kanerva (SAK) method. In this paper, a modification of the SAK method is proposed: the Uniform Similarity-Aware Kanerva (USAK) method, which is based on a uniform distribution of prototypes in the state space. This approach reduces the RAM required to store prototypes. In addition, reducing the receptive distance of each prototype makes it possible to increase learning speed by reducing the number of calculations in the linear approximator.
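The abstract outlines the core mechanics of Kanerva-coding-based value approximation: prototypes scattered across the state space, a receptive distance that decides which prototypes a given state activates, and a linear approximator over the weights of the active prototypes. The sketch below illustrates this scheme with prototypes placed on a uniform grid, in the spirit of the USAK idea. It is a minimal illustration only: the class name, grid construction, update rule, and all parameter values are assumptions of this sketch, not the authors' implementation.

```python
import numpy as np

class UniformKanervaApproximator:
    """Illustrative Kanerva-coding value approximator with uniformly
    distributed prototypes (a hypothetical sketch, not the paper's code)."""

    def __init__(self, low, high, prototypes_per_dim, receptive_distance):
        # Place prototypes on a uniform grid over the state space,
        # mirroring the uniform prototype distribution USAK is based on.
        axes = [np.linspace(l, h, prototypes_per_dim) for l, h in zip(low, high)]
        mesh = np.meshgrid(*axes, indexing="ij")
        self.prototypes = np.stack([m.ravel() for m in mesh], axis=1)
        self.radius = receptive_distance      # receptive distance of each prototype
        self.weights = np.zeros(len(self.prototypes))

    def features(self, state):
        # A prototype is active when the state lies within its receptive
        # distance; a smaller radius activates fewer prototypes and thus
        # leaves fewer terms to sum in the linear approximator.
        dists = np.linalg.norm(self.prototypes - state, axis=1)
        return (dists <= self.radius).astype(float)

    def value(self, state):
        # Linear approximator: weighted sum over the active prototypes.
        return self.features(state) @ self.weights

    def update(self, state, target, alpha=0.1):
        # Simple semi-gradient update toward a supplied target value.
        phi = self.features(state)
        active = phi.sum()
        if active > 0:
            self.weights += alpha * (target - phi @ self.weights) * phi / active
```

A short usage example under the same assumptions:

```python
approx = UniformKanervaApproximator(low=(0.0, 0.0), high=(1.0, 1.0),
                                    prototypes_per_dim=10,
                                    receptive_distance=0.15)
s = np.array([0.3, 0.7])
approx.update(s, target=1.0)
print(approx.value(s))  # moves toward 1.0 over repeated updates
```

Note that with a uniform grid the prototype coordinates could in principle be recomputed from the grid parameters rather than stored, which may be one source of the RAM saving the abstract describes; the explicit prototype array above is kept only for clarity.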