深度强化学习与调整

2021 IEEE 19th International Conference on Industrial Informatics (INDIN) Pub Date : 2021-07-21 DOI:10.1109/INDIN45523.2021.9557543

H. Khorasgani, Haiyan Wang, Chetan Gupta, Susumu Serita

{"title":"深度强化学习与调整","authors":"H. Khorasgani, Haiyan Wang, Chetan Gupta, Susumu Serita","doi":"10.1109/INDIN45523.2021.9557543","DOIUrl":null,"url":null,"abstract":"Deep reinforcement learning (RL) algorithms can learn complex policies to optimize agent operation over time. RL algorithms have shown promising results in solving complicated problems in recent years. However, their application on real-world physical systems remains limited. Despite the advancements in RL algorithms, the industries often prefer traditional control strategies. Traditional methods are simple, computationally efficient and easy to adjust. In this paper, we first propose a new Q-learning algorithm for continuous action space, which can bridge the control and RL algorithms and bring us the best of both worlds. Our method can learn complex policies to achieve long-term goals and at the same time it can be easily adjusted to address short-term requirements without retraining. Next, we present an approximation of our algorithm which can be applied to address short-term requirements of any pre-trained RL algorithm. The case studies demonstrate that both our proposed method as well as its practical approximation can achieve short-term and long-term goals without complex reward functions.","PeriodicalId":370921,"journal":{"name":"2021 IEEE 19th International Conference on Industrial Informatics (INDIN)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-07-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Deep Reinforcement Learning with Adjustments\",\"authors\":\"H. Khorasgani, Haiyan Wang, Chetan Gupta, Susumu Serita\",\"doi\":\"10.1109/INDIN45523.2021.9557543\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Deep reinforcement learning (RL) algorithms can learn complex policies to optimize agent operation over time. RL algorithms have shown promising results in solving complicated problems in recent years. However, their application on real-world physical systems remains limited. Despite the advancements in RL algorithms, the industries often prefer traditional control strategies. Traditional methods are simple, computationally efficient and easy to adjust. In this paper, we first propose a new Q-learning algorithm for continuous action space, which can bridge the control and RL algorithms and bring us the best of both worlds. Our method can learn complex policies to achieve long-term goals and at the same time it can be easily adjusted to address short-term requirements without retraining. Next, we present an approximation of our algorithm which can be applied to address short-term requirements of any pre-trained RL algorithm. The case studies demonstrate that both our proposed method as well as its practical approximation can achieve short-term and long-term goals without complex reward functions.\",\"PeriodicalId\":370921,\"journal\":{\"name\":\"2021 IEEE 19th International Conference on Industrial Informatics (INDIN)\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-07-21\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2021 IEEE 19th International Conference on Industrial Informatics (INDIN)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/INDIN45523.2021.9557543\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE 19th International Conference on Industrial Informatics (INDIN)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/INDIN45523.2021.9557543","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 1

摘要

深度强化学习(RL)算法可以学习复杂的策略来优化智能体的运行。近年来，强化学习算法在解决复杂问题方面显示出良好的效果。然而，它们在现实世界物理系统中的应用仍然有限。尽管强化学习算法取得了进步，但行业通常更喜欢传统的控制策略。传统方法简单，计算效率高，易于调整。在本文中，我们首先提出了一种新的连续动作空间的q -学习算法，它可以桥接控制和强化学习算法，并为我们带来两全其美的效果。我们的方法可以学习复杂的政策来实现长期目标，同时它可以很容易地调整以满足短期需求，而无需再培训。接下来，我们提出了我们的算法的近似值，可以应用于解决任何预训练RL算法的短期需求。案例研究表明，我们提出的方法及其实际逼近都可以实现短期和长期目标，而不需要复杂的奖励函数。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Deep Reinforcement Learning with Adjustments

Deep reinforcement learning (RL) algorithms can learn complex policies to optimize agent operation over time. RL algorithms have shown promising results in solving complicated problems in recent years. However, their application on real-world physical systems remains limited. Despite the advancements in RL algorithms, the industries often prefer traditional control strategies. Traditional methods are simple, computationally efficient and easy to adjust. In this paper, we first propose a new Q-learning algorithm for continuous action space, which can bridge the control and RL algorithms and bring us the best of both worlds. Our method can learn complex policies to achieve long-term goals and at the same time it can be easily adjusted to address short-term requirements without retraining. Next, we present an approximation of our algorithm which can be applied to address short-term requirements of any pre-trained RL algorithm. The case studies demonstrate that both our proposed method as well as its practical approximation can achieve short-term and long-term goals without complex reward functions.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2021 IEEE 19th International Conference on Industrial Informatics (INDIN)

自引率

0.00%

发文量