{"title":"A Robust Offline Reinforcement Learning Algorithm Based on Behavior Regularization Methods","authors":"Yan Zhang, Tianhan Gao, Qingwei Mi","doi":"10.1109/IAICT55358.2022.9887435","DOIUrl":null,"url":null,"abstract":"Offline deep reinforcement learning algorithms are still in developing. Some existing algorithms have shown that it is feasible to learn directly without using environmental interaction under the condition of sufficient datasets. In this paper, we combine an offline reinforcement learning method through behavior regularization with a robust offline reinforcement learning algorithm. Moreover, the algorithm is verified and analyzed with a high-quality but limited dataset. The experimental results show that it is feasible to combine the behavior regularization method with the robust offline reinforcement learning algorithm, to gain better performance under the condition of limited data compared with the baseline algorithms.","PeriodicalId":154027,"journal":{"name":"2022 IEEE International Conference on Industry 4.0, Artificial Intelligence, and Communications Technology (IAICT)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE International Conference on Industry 4.0, Artificial Intelligence, and Communications Technology (IAICT)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IAICT55358.2022.9887435","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
Offline deep reinforcement learning algorithms are still under active development. Existing work has shown that, given a sufficiently large dataset, it is feasible to learn a policy directly without any environment interaction. In this paper, we combine a behavior-regularization-based offline reinforcement learning method with a robust offline reinforcement learning algorithm. The combined algorithm is then verified and analyzed on a high-quality but limited dataset. The experimental results show that combining the behavior regularization method with the robust offline reinforcement learning algorithm is feasible and yields better performance than the baseline algorithms under limited data.
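
The abstract does not give the authors' implementation, but the generic shape of a behavior-regularized actor objective is well established (e.g. BRAC-style methods): maximize the critic's Q-value while penalizing divergence between the learned policy and an estimate of the behavior policy that generated the dataset. The PyTorch sketch below illustrates only that template under assumed diagonal-Gaussian policies and a KL penalty; the function name, the penalty weight alpha, and the choice of divergence are hypothetical, and the paper's robust component is not modeled here.

```python
import torch
import torch.distributions as D

def gaussian(mean, log_std):
    # Diagonal-Gaussian policy; Independent(..., 1) treats the last
    # dimension as the action event dimension.
    return D.Independent(D.Normal(mean, log_std.exp()), 1)

def behavior_regularized_actor_loss(q_value, pi_mean, pi_log_std,
                                    beta_mean, beta_log_std, alpha=1.0):
    """Actor loss of the form  -Q(s, a) + alpha * KL(pi || pi_beta).

    This is the generic behavior-regularization template, not the
    paper's exact objective: pi is the learned policy and pi_beta a
    behavior-policy estimate fit to the offline dataset.
    """
    pi = gaussian(pi_mean, pi_log_std)
    pi_beta = gaussian(beta_mean, beta_log_std)
    kl = D.kl_divergence(pi, pi_beta)  # per-state divergence penalty
    return (-q_value + alpha * kl).mean()

if __name__ == "__main__":
    # Toy usage on random tensors: batch of 4 states, 2-dim actions.
    b, a_dim = 4, 2
    loss = behavior_regularized_actor_loss(
        q_value=torch.randn(b),
        pi_mean=torch.randn(b, a_dim), pi_log_std=torch.zeros(b, a_dim),
        beta_mean=torch.randn(b, a_dim), beta_log_std=torch.zeros(b, a_dim),
    )
    print(loss)
```

The penalty weight alpha controls the trade-off the abstract alludes to: with limited data, a larger alpha keeps the learned policy close to the well-supported behavior distribution, at the cost of less improvement over it.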