Learning From Big Data: A Survey and Evaluation of Approximation Technologies for Large-Scale Reinforcement Learning

Cheng Wu, Yiming Wang
Published in: 2017 IEEE International Conference on Computer and Information Technology (CIT), 2017-08-01
DOI: 10.1109/CIT.2017.11 (https://doi.org/10.1109/CIT.2017.11)
Citations: 2

Abstract

A key problem in large-scale reinforcement learning is coping with big data: a very large number of environment states and many possible actions. Function approximation can improve a reinforcement learner's ability to solve large-scale problems. Tile coding and Kanerva coding are two classical methods for implementing function approximation, but both may perform poorly on large-scale, high-dimensional instances. In this paper, we evaluate a collection of hard instances of the predator-prey pursuit problem, a classic reinforcement learning platform with a scalable state-action space, to compare these two methods and their optimization techniques. We first show that Kanerva coding gives better results than Tile coding as the dimension of the instances increases. We then describe a feature optimization mechanism and show that it can increase the fraction of instances solved by both Tile coding and Kanerva coding. Finally, we demonstrate that a fuzzy approach to function approximation can further increase the fraction of instances solved. We show that our fuzzy approach to Kanerva coding outperforms fuzzy Tile coding when feature optimization is applied. We conclude that discrete and fuzzy Kanerva coding are powerful function approximation techniques that can outperform discrete and fuzzy Tile coding on large-scale, high-dimensional learning problems.
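
As a rough illustration of the coding schemes the abstract compares, the sketch below computes the features that a state activates under Tile coding, discrete Kanerva coding, and a fuzzy variant of Kanerva coding. It is not the authors' implementation: the function names, the Hamming-distance activation rule, the Gaussian membership function, and every parameter value are assumptions chosen for illustration, since the abstract does not specify them.

```python
import math

def tile_features(state, n_tilings, tiles_per_dim, low, high):
    """Tile coding: return one active tile per tiling for a continuous state.

    Assumes every dimension shares the same [low, high] range (illustrative only).
    """
    width = (high - low) / tiles_per_dim
    active = []
    for t in range(n_tilings):
        # Each tiling is shifted by a fraction of one tile width.
        offset = t * width / n_tilings
        coords = tuple(
            min(int((x - low + offset) / width), tiles_per_dim) for x in state
        )
        active.append((t, coords))
    return active

def kanerva_features(state, prototypes, radius=1):
    """Discrete Kanerva coding: indices of prototypes within Hamming
    distance `radius` of the state (assumed activation rule)."""
    return [
        i for i, p in enumerate(prototypes)
        if sum(a != b for a, b in zip(state, p)) <= radius
    ]

def fuzzy_memberships(state, prototypes, sigma=1.0):
    """Fuzzy variant: a graded membership in (0, 1] for every prototype.

    The Gaussian of the Euclidean distance is an assumed membership function.
    """
    grades = []
    for p in prototypes:
        sq_dist = sum((a - b) ** 2 for a, b in zip(state, p))
        grades.append(math.exp(-sq_dist / (2 * sigma ** 2)))
    return grades

def value(weights, memberships):
    """Approximate value: membership-weighted sum of per-prototype weights."""
    return sum(w * m for w, m in zip(weights, memberships))

prototypes = [(1, 2, 3), (1, 2, 4), (0, 0, 0)]  # hypothetical prototype states
state = (1, 2, 3)
print(kanerva_features(state, prototypes))  # → [0, 1]
print(fuzzy_memberships(state, prototypes))
```

In this sketch the number of Kanerva prototypes is chosen independently of the state dimension, whereas the number of distinct tiles grows with `tiles_per_dim` raised to the number of dimensions, which is one common explanation for why prototype-based coding is considered for high-dimensional problems.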