Deep Reinforcement Learning for Transfer of Control Policies

James Cunningham, S. Miller, M. Yukish, T. Simpson, Conrad S. Tucker
DOI: 10.1115/detc2019-97689
Published in: Volume 2A: 45th Design Automation Conference
Publication date: 2019-11-25
Citations: 0

Abstract

We present a form-aware reinforcement learning (RL) method to extend control knowledge from one design form to another, without losing the ability to control the original design. A major challenge in developing control knowledge is the creation of generalized control policies across designs of varying form. Our presented RL policy is form-aware because in addition to receiving dynamic state information about the environment, it also receives states that encode information about the form of the design that is being controlled. In this paper, we investigate the impact of this mixed state space on transfer learning. We present a transfer learning method for extending a control policy to a different design form, while continuing to expose the agent to the original design during the training of the new design. To demonstrate this concept, we present a case study of a multi-rotor aircraft simulation, wherein the designated task is to achieve a stable hover. We show that by introducing form states, an RL agent is able to learn a control policy to achieve the hovering task with both a four rotor and three rotor design at once, whereas without the form states it can only hover with the four rotor design. We also benchmark our method against a test case that removes the transfer learning component, as well as a test case that removes the continued exposure to the original design, to show the value of each of these components. We find that form states, transfer learning, and parallel learning all contribute to a more robust control policy for the new design, and that parallel learning is especially important for maintaining control knowledge of the original design.
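The "mixed state space" idea in the abstract can be illustrated with a minimal sketch. This is our own illustration, not the authors' code: the function name and the form-descriptor encoding (rotor count, arm length, mass) are assumptions, since the paper does not specify them here. The key point is that one policy network receives the vehicle's dynamic state concatenated with a fixed vector describing the design form, so its actions can be conditioned on which design it is controlling.

```python
import numpy as np

def form_aware_observation(dynamic_state, form_state):
    """Return the mixed observation: the dynamic state (position,
    velocity, attitude, ...) concatenated with a vector that encodes
    the form of the design being controlled."""
    return np.concatenate([np.asarray(dynamic_state, dtype=float),
                           np.asarray(form_state, dtype=float)])

# Hypothetical form descriptors: [rotor count, arm length (m), mass (kg)].
# The specific encoding is an assumption for illustration.
FOUR_ROTOR = [4.0, 0.25, 1.2]
THREE_ROTOR = [3.0, 0.25, 1.0]

dynamic_state = np.zeros(12)  # e.g. a 12-D rigid-body state
obs4 = form_aware_observation(dynamic_state, FOUR_ROTOR)
obs3 = form_aware_observation(dynamic_state, THREE_ROTOR)
# Identical dynamics but different form states yield different
# observations, so the same policy can act differently per design.
```

Without the form states, the two designs would produce identical observations in identical flight conditions, which is consistent with the paper's finding that the form-blind agent only learns to hover the four-rotor design.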
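The "parallel learning" component (continued exposure to the original design while training on the new one) can be sketched as an interleaved training loop. The scheduling scheme, function names, and stub batch/update callables below are placeholders for illustration, not the authors' implementation.

```python
import random

def train_parallel(policy, original_env, new_env, iterations,
                   collect_batch, update_policy, original_frac=0.5):
    """Interleave experience from the original design with experience
    from the new design, so the policy keeps seeing the original
    design and does not forget how to control it."""
    for _ in range(iterations):
        env = original_env if random.random() < original_frac else new_env
        policy = update_policy(policy, collect_batch(policy, env))
    return policy

# Toy demonstration with stub batch-collection and update functions.
visited = []

def _collect(policy, env):
    visited.append(env)  # record which design supplied the batch
    return env

def _update(policy, batch):
    return policy + 1  # stand-in for a gradient step

random.seed(0)
final_policy = train_parallel(0, "four_rotor", "three_rotor", 10,
                              _collect, _update)
```

Setting `original_frac` to 0 recovers the ablation described in the abstract that removes continued exposure to the original design.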