An Advisor-Based Architecture for a Sample-Efficient Training of Autonomous Navigation Agents with Reinforcement Learning

IF 2.9 Q2 ROBOTICS
Robotics, Pub Date: 2023-09-28, DOI: 10.3390/robotics12050133
Rukshan Darshana Wijesinghe, Dumindu Tissera, Mihira Kasun Vithanage, Alex Xavier, Subha Fernando, Jayathu Samarawickrama
Citations: 0

Abstract

An Advisor-Based Architecture for a Sample-Efficient Training of Autonomous Navigation Agents with Reinforcement Learning
Recent advancements in artificial intelligence have enabled reinforcement learning (RL) agents to exceed human-level performance in various gaming tasks. However, despite the state-of-the-art performance demonstrated by model-free RL algorithms, they suffer from high sample complexity. Hence, their applications in robotics, autonomous navigation, and self-driving are uncommon, as gathering many samples is impractical on real-world hardware systems. Therefore, developing sample-efficient learning algorithms for RL agents is crucial for deploying them in real-world tasks without sacrificing performance. This paper presents an advisor-based learning algorithm that incorporates prior knowledge into training by modifying the deep deterministic policy gradient (DDPG) algorithm to reduce the sample complexity. We also propose an effective method of employing an advisor in data collection to train autonomous navigation agents to maneuver physical platforms while minimizing the risk of collision. We analyze the performance of our methods in simulation and physical experimental setups. Experiments reveal that incorporating an advisor into the training phase significantly reduces the sample complexity without compromising the agent’s performance compared to various benchmark approaches. They also show that the advisor’s constant involvement in the data collection process diminishes the agent’s performance, while limited involvement makes training more effective.
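The abstract reports that limited advisor involvement in data collection works better than constant involvement, but it does not say how the involvement is limited. The sketch below is only one plausible illustration of the idea, not the authors' method: it assumes the advisor's action is chosen with a probability that decays linearly over training steps, after which the agent's own policy collects the data. All names here (`advisor_probability`, `select_action`, `toy_agent_policy`, `toy_advisor_policy`, `decay_steps`) are hypothetical.

```python
import random


def advisor_probability(step, start=1.0, end=0.0, decay_steps=1000):
    """Assumed schedule: linearly decay the advisor's share from start to end."""
    frac = min(max(step / decay_steps, 0.0), 1.0)
    return start + (end - start) * frac


def select_action(state, step, agent_policy, advisor_policy, rng, decay_steps=1000):
    """Pick the advisor's action with decaying probability, else the agent's.

    Returns (action, source) so the caller can log who produced each sample.
    """
    if rng.random() < advisor_probability(step, decay_steps=decay_steps):
        return advisor_policy(state), "advisor"
    return agent_policy(state), "agent"


def toy_agent_policy(state):
    """Stand-in for an untrained actor network: always drive straight."""
    return 0.0


def toy_advisor_policy(state):
    """Stand-in for prior knowledge: steer toward the side with more clearance.

    state is (left_clearance, right_clearance); +1.0 steers right, -1.0 left.
    """
    left, right = state
    return 1.0 if right > left else -1.0


if __name__ == "__main__":
    rng = random.Random(0)
    counts = {"advisor": 0, "agent": 0}
    for step in range(2000):
        _, source = select_action((0.4, 1.2), step, toy_agent_policy,
                                  toy_advisor_policy, rng)
        counts[source] += 1
    print(counts)
```

In a DDPG-style loop, the returned action would be executed and the transition stored in the replay buffer as usual; the schedule only controls how often the advisor, rather than the learning agent, decides what data gets collected.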
Source journal
Robotics
Robotics Mathematics-Control and Optimization
CiteScore: 6.70
Self-citation rate: 8.10%
Articles published: 114
Review time: 11 weeks
Journal description: Robotics publishes original papers, technical reports, case studies, review papers and tutorials in all aspects of robotics. Special Issues devoted to important topics in advanced robotics will be published from time to time. It particularly welcomes emerging methodologies and techniques which bridge theoretical studies and applications and have significant potential for real-world applications. It provides a forum for information exchange between professionals, academicians and engineers who are working in the area of robotics, helping them to disseminate research findings and to learn from each other’s work. Suitable topics include, but are not limited to:
- intelligent robotics, mechatronics, and biomimetics
- novel and biologically-inspired robotics
- modelling, identification and control of robotic systems
- biomedical, rehabilitation and surgical robotics
- exoskeletons, prosthetics and artificial organs
- AI, neural networks and fuzzy logic in robotics
- multimodality human-machine interaction
- wireless sensor networks for robot navigation
- multi-sensor data fusion and SLAM