USAK METHOD FOR THE REINFORCEMENT LEARNING

M. Novotarskyi, V. Kuzmich
{"title":"USAK METHOD FOR THE REINFORCEMENT LEARNING","authors":"M. Novotarskyi, V. Kuzmich","doi":"10.20535/2708-4930.1.2020.216042","DOIUrl":null,"url":null,"abstract":"In the field of reinforcement learning, tabular methods have become widespread. There are many important scientific results, which significantly improve their performance in specific applications. However, the application of tabular methods is limited due to the large amount of resources required to store value functions in tabular form under high-dimensional state spaces. A natural solution to the memory problem is to use parameterized function approximations. However, conventional approaches to function approximations, in most cases, have ceased to give the desired result of memory reduction in solving realworld problems. This fact became the basis for the application of new approaches, one of which is the use of Sparse Distributed Memory (SDM) based on Kanerva coding. A further development of this direction was the method of Similarity-Aware Kanerva (SAK). In this paper, a modification of the SAK method is proposed, the Uniform Similarity-Aware Kanerva (USAK) method, which is based on the uniform distribution of prototypes in the state space. This approach has reduced the use of RAM required to store prototypes. In addition, reducing the receptive distance of each of the prototypes made it possible to increase the learning speed by reducing the number of calculations in the linear approximator.","PeriodicalId":411692,"journal":{"name":"Information, Computing and Intelligent systems","volume":"C-17 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Information, Computing and Intelligent systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.20535/2708-4930.1.2020.216042","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

In the field of reinforcement learning, tabular methods have become widespread. Many important scientific results significantly improve their performance in specific applications. However, the applicability of tabular methods is limited by the large amount of memory required to store value functions in tabular form over high-dimensional state spaces. A natural solution to this memory problem is to use parameterized function approximation. However, conventional approaches to function approximation have, in most cases, ceased to deliver the desired memory reduction when solving real-world problems. This fact became the basis for applying new approaches, one of which is the use of Sparse Distributed Memory (SDM) based on Kanerva coding. A further development in this direction is the Similarity-Aware Kanerva (SAK) method. In this paper, a modification of the SAK method is proposed: the Uniform Similarity-Aware Kanerva (USAK) method, which is based on a uniform distribution of prototypes in the state space. This approach reduces the RAM required to store prototypes. In addition, reducing the receptive distance of each prototype makes it possible to increase learning speed by reducing the number of calculations in the linear approximator.
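The abstract outlines the core mechanics of Kanerva-coding-based value approximation: prototypes scattered across the state space, a receptive distance that decides which prototypes a given state activates, and a linear approximator over the weights of the active prototypes. The sketch below illustrates this scheme with prototypes placed on a uniform grid, in the spirit of the USAK idea. It is a minimal illustration only: the class name, grid construction, update rule, and all parameter values are assumptions of this sketch, not the authors' implementation.

```python
import numpy as np

class UniformKanervaApproximator:
    """Illustrative Kanerva-coding value approximator with uniformly
    distributed prototypes (a hypothetical sketch, not the paper's code)."""

    def __init__(self, low, high, prototypes_per_dim, receptive_distance):
        # Place prototypes on a uniform grid over the state space,
        # mirroring the uniform prototype distribution USAK is based on.
        axes = [np.linspace(l, h, prototypes_per_dim) for l, h in zip(low, high)]
        mesh = np.meshgrid(*axes, indexing="ij")
        self.prototypes = np.stack([m.ravel() for m in mesh], axis=1)
        self.radius = receptive_distance      # receptive distance of each prototype
        self.weights = np.zeros(len(self.prototypes))

    def features(self, state):
        # A prototype is active when the state lies within its receptive
        # distance; a smaller radius activates fewer prototypes and thus
        # leaves fewer terms to sum in the linear approximator.
        dists = np.linalg.norm(self.prototypes - state, axis=1)
        return (dists <= self.radius).astype(float)

    def value(self, state):
        # Linear approximator: weighted sum over the active prototypes.
        return self.features(state) @ self.weights

    def update(self, state, target, alpha=0.1):
        # Simple semi-gradient update toward a supplied target value.
        phi = self.features(state)
        active = phi.sum()
        if active > 0:
            self.weights += alpha * (target - phi @ self.weights) * phi / active
```

A short usage example under the same assumptions:

```python
approx = UniformKanervaApproximator(low=(0.0, 0.0), high=(1.0, 1.0),
                                    prototypes_per_dim=10,
                                    receptive_distance=0.15)
s = np.array([0.3, 0.7])
approx.update(s, target=1.0)
print(approx.value(s))  # moves toward 1.0 over repeated updates
```

Note that with a uniform grid the prototype coordinates could in principle be recomputed from the grid parameters rather than stored, which may be one source of the RAM saving the abstract describes; the explicit prototype array above is kept only for clarity.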