Selective Data Augmentation for Improving the Performance of Offline Reinforcement Learning

2022 22nd International Conference on Control, Automation and Systems (ICCAS). Pub Date: 2022-11-27. DOI: 10.23919/ICCAS55662.2022.10003747

Jungwook Han, Jinwhan Kim


Citations: 1

Abstract



This study proposes a new data augmentation technique for offline reinforcement learning (RL). Rather than choosing data points for augmentation at random, the method selectively draws data from sparse subspaces of the dataset, so that augmentation targets the regions that are underrepresented in the original dataset. For the augmentation, the subspaces of the dataset are represented in the latent space of a variational autoencoder (VAE). Data are then sampled from the latent space and mapped back to the original space by the VAE decoder, and the resulting augmented data are added to the original dataset. Because the VAE generates new data points from a latent space that captures the original data distribution, the virtual data do not deviate severely from the original data. We evaluate the method on several offline RL datasets generated from OpenAI Gym benchmark control simulations, which mainly use state-based inputs.
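The pipeline described in the abstract (encode, find sparse latent regions, sample there, decode, append) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not say how sparsity is measured or how latent samples are drawn, so the k-nearest-neighbour distance score, the Gaussian perturbation, and every function and parameter name below are assumptions. A linear PCA codec stands in for the VAE encoder and decoder to keep the example self-contained.

```python
import numpy as np

def fit_linear_codec(data, latent_dim):
    """Stand-in for a trained VAE (assumption): a PCA encoder/decoder pair."""
    mean = data.mean(axis=0)
    _, _, vt = np.linalg.svd(data - mean, full_matrices=False)
    w = vt[:latent_dim].T                      # (data_dim, latent_dim)
    return (lambda x: (x - mean) @ w), (lambda z: z @ w.T + mean)

def knn_sparsity(z, k=5):
    """Mean distance to the k nearest neighbours; larger means sparser."""
    d = np.linalg.norm(z[:, None, :] - z[None, :, :], axis=-1)
    d.sort(axis=1)
    return d[:, 1:k + 1].mean(axis=1)          # column 0 is the self-distance

def selective_augment(data, encode, decode, n_new, k=5, frac=0.1,
                      noise=0.05, seed=0):
    """Generate n_new points near the sparsest `frac` of the latent codes."""
    rng = np.random.default_rng(seed)
    z = encode(data)
    score = knn_sparsity(z, k)
    n_sparse = max(1, int(frac * len(z)))
    sparse_idx = np.argsort(score)[-n_sparse:]  # indices of sparsest codes
    picks = rng.choice(sparse_idx, size=n_new)  # sample with replacement
    z_new = z[picks] + noise * rng.standard_normal((n_new, z.shape[1]))
    return decode(z_new), sparse_idx

# Toy dataset: 500 tightly packed points plus 20 spread-out points far away.
rng = np.random.default_rng(42)
dense = 0.05 * rng.standard_normal((500, 4))
sparse = 5.0 + rng.standard_normal((20, 4))
dataset = np.vstack([dense, sparse])

encode, decode = fit_linear_codec(dataset, latent_dim=2)
augmented, sparse_idx = selective_augment(dataset, encode, decode,
                                          n_new=50, frac=0.04)
dataset_aug = np.vstack([dataset, augmented])   # original plus virtual data
```

In an actual offline RL setting each row would be a transition (state, action, reward, next state) rather than a toy vector, and the codec would be a VAE trained on the dataset; the selection step is the only part this sketch tries to illustrate.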

