Learning Implicit Social Navigation Behavior Using Deep Inverse Reinforcement Learning

Impact factor: 4.6 · Zone 2 (Computer Science) · JCR Q2 (Robotics)
Tribhi Kathuria;Ke Liu;Junwoo Jang;X. Jessie Yang;Maani Ghaffari
DOI: 10.1109/LRA.2025.3557299
Journal: IEEE Robotics and Automation Letters, vol. 10, no. 5, pp. 5146-5153
Published: 2025-04-02
Publication type: Journal Article
Open access: no
Source: https://ieeexplore.ieee.org/document/10947583/
Citations: 0

Abstract

This paper reports on learning a reward map for social navigation in dynamic environments where the robot can reason about its path at any time, given agent trajectories and scene geometry. Humans navigating in dense and dynamic indoor environments often work with several implied social rules. A rule-based approach fails to model all possible interactions between humans, robots, and scenes. We propose a novel Smooth Maximum Entropy Deep Inverse Reinforcement Learning (S-MEDIRL) algorithm that can extrapolate beyond expert demos to better encode scene navigability from few-shot demonstrations. The agent learns to predict the cost maps based on trajectory data as well as scene geometry. The trajectory sampled from the learned cost map is then executed using a local crowd navigation controller. We present results in a photo-realistic simulation environment, with a robot and a human navigating a narrow crossing scenario. The robot implicitly learns to exhibit social behaviors such as yielding to oncoming traffic and avoiding deadlocks. We compare the proposed approach to the popular model-based crowd navigation algorithm ORCA and a rule-based agent that exhibits yielding.
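To make the step from a cost map to a trajectory concrete, here is a minimal sketch, assuming a small hand-written grid in place of a learned cost map. The abstract does not say how trajectories are sampled, so the planner below (Dijkstra on a 4-connected grid, returning one deterministic minimum-cost path) is an illustrative stand-in and not the authors' method. The function name `min_cost_path`, the toy `cost_map`, and the convention that entering a cell costs that cell's value are all assumptions.

```python
import heapq


def min_cost_path(cost_map, start, goal):
    """Dijkstra on a 4-connected grid. Entering cell (r, c) costs cost_map[r][c].

    Returns (path, total_cost); path is a list of (row, col) from start to goal.
    """
    rows, cols = len(cost_map), len(cost_map[0])
    best = {start: 0.0}          # cheapest known cost to reach each cell
    parent = {}                  # back-pointers for path reconstruction
    frontier = [(0.0, start)]    # min-heap of (cost so far, cell)
    while frontier:
        cost, cell = heapq.heappop(frontier)
        if cell == goal:
            break
        if cost > best[cell]:
            continue             # stale heap entry, a cheaper route was found
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                new_cost = cost + cost_map[nr][nc]
                if new_cost < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_cost
                    parent[(nr, nc)] = cell
                    heapq.heappush(frontier, (new_cost, (nr, nc)))
    if goal not in best:
        return None, float("inf")
    path = [goal]
    while path[-1] != start:
        path.append(parent[path[-1]])
    path.reverse()
    return path, best[goal]


# Hypothetical cost map: the two 9s stand for cells the model considers
# costly, for example where an oncoming person is expected to pass.
cost_map = [
    [1, 1, 1, 1],
    [1, 9, 9, 1],
    [2, 2, 2, 2],
]
path, total = min_cost_path(cost_map, start=(1, 0), goal=(1, 3))
print(path, total)
```

On this toy grid the planner routes around the high-cost cells through the top row instead of going straight through them. In the paper's pipeline the resulting trajectory would then be handed to a local crowd navigation controller, which is not modeled here.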
Source journal: IEEE Robotics and Automation Letters
Subject area: Computer Science, Computer Science Applications
CiteScore: 9.60
Self-citation rate: 15.40%
Articles published: 1428
Journal scope: The scope of this journal is to publish peer-reviewed articles that provide a timely and concise account of innovative research ideas and application results, reporting significant theoretical findings and application case studies in areas of robotics and automation.