A Feature Analysis Tool for Batch RL Datasets

Ruiyang Xu, Zhengxing Chen
Companion Proceedings of the Web Conference 2021. Published 2021-04-19.
DOI: 10.1145/3442442.3453147 (https://doi.org/10.1145/3442442.3453147)

Abstract

Batch RL is concerned with learning a decision policy from a given dataset without interacting with the environment. Although learning-related issues (e.g., convergence speed, stability, and safety) are actively researched, the empirical challenges that arise before learning are largely ignored. Many RL practitioners face the challenge of determining whether a designed Markov Decision Process (MDP) is valid and meaningful. This study proposes a model-based method that uses heuristic-based feature analysis to check whether an MDP designed for a given dataset is well formulated. We tested our method in constructed as well as more realistic environments. Our results show that the approach can identify potential problems in the data. As far as we know, performing validity analysis on batch RL data is a novel direction, and we envision our tool serving as a motivational example that helps practitioners apply RL more easily.
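The abstract does not spell out the heuristics, so the following is only a minimal sketch of what a model-based feature check on batch transitions could look like. It is not the authors' method. It assumes discrete state features, fits a tabular reward model on logged (state, action, reward, next_state) tuples, and reports how much the in-sample error grows when each feature is dropped. The function names, the toy dataset, and the ablation heuristic are all hypothetical.

```python
from collections import defaultdict


def tabular_reward_error(transitions, keep):
    """In-sample MSE of a tabular reward model that only sees features in `keep`."""
    keep = list(keep)
    totals = defaultdict(lambda: [0.0, 0])  # (kept features, action) -> [reward sum, count]
    for state, action, reward, _next_state in transitions:
        key = (tuple(state[i] for i in keep), action)
        totals[key][0] += reward
        totals[key][1] += 1
    squared_error = 0.0
    for state, action, reward, _next_state in transitions:
        key = (tuple(state[i] for i in keep), action)
        reward_sum, count = totals[key]
        squared_error += (reward - reward_sum / count) ** 2
    return squared_error / len(transitions)


def feature_ablation_report(transitions, n_features):
    """Return the full-model error and the error increase when each feature is dropped."""
    full_error = tabular_reward_error(transitions, range(n_features))
    increase = {}
    for dropped in range(n_features):
        keep = [i for i in range(n_features) if i != dropped]
        increase[dropped] = tabular_reward_error(transitions, keep) - full_error
    return full_error, increase


# Hypothetical toy data: state = (x, noise); reward equals x when action is 1, else 0.
toy = [
    ((x, noise), action, float(x * action), (x, noise))
    for x in (0, 1)
    for noise in (0, 1)
    for action in (0, 1)
]

full_err, gains = feature_ablation_report(toy, n_features=2)
print("full-model error:", full_err)
print("error increase per dropped feature:", gains)
```

In this sketch, a feature whose removal leaves the error unchanged is a candidate for being irrelevant to the reward, while a large error that persists even with every feature included may hint that the state omits information the MDP needs. Because the error is measured in-sample, a real analysis would want a held-out split and a model suited to continuous features.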