Conference on Robot Learning: Latest Publications

MResT: Multi-Resolution Sensing for Real-Time Control with Vision-Language Models
Conference on Robot Learning Pub Date: 2024-01-25 DOI: 10.48550/arXiv.2401.14502
Saumya Saxena, Mohit Sharma, Oliver Kroemer
{"title":"MResT: Multi-Resolution Sensing for Real-Time Control with Vision-Language Models","authors":"Saumya Saxena, Mohit Sharma, Oliver Kroemer","doi":"10.48550/arXiv.2401.14502","DOIUrl":"https://doi.org/10.48550/arXiv.2401.14502","url":null,"abstract":"Leveraging sensing modalities across diverse spatial and temporal resolutions can improve performance of robotic manipulation tasks. Multi-spatial resolution sensing provides hierarchical information captured at different spatial scales and enables both coarse and precise motions. Simultaneously multi-temporal resolution sensing enables the agent to exhibit high reactivity and real-time control. In this work, we propose a framework, MResT (Multi-Resolution Transformer), for learning generalizable language-conditioned multi-task policies that utilize sensing at different spatial and temporal resolutions using networks of varying capacities to effectively perform real time control of precise and reactive tasks. We leverage off-the-shelf pretrained vision-language models to operate on low-frequency global features along with small non-pretrained models to adapt to high frequency local feedback. Through extensive experiments in 3 domains (coarse, precise and dynamic manipulation tasks), we show that our approach significantly improves (2X on average) over recent multi-task baselines. Further, our approach generalizes well to visual and geometric variations in target objects and to varying interaction forces.","PeriodicalId":273870,"journal":{"name":"Conference on Robot Learning","volume":"299 3","pages":"2210-2228"},"PeriodicalIF":0.0,"publicationDate":"2024-01-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140495172","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cited: 1
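The control scheme described in the abstract above (a large pretrained vision-language model refreshing global features at low frequency while a small non-pretrained network reacts to high-frequency local feedback) can be illustrated with a minimal control-loop sketch. The module names, update rates, observation keys, and action dimensions below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

class SlowGlobalEncoder:
    """Stand-in for a frozen pretrained vision-language model (low frequency, ~1 Hz)."""
    def encode(self, image: np.ndarray, instruction: str) -> np.ndarray:
        # Placeholder: a real system would run a pretrained VLM here.
        return np.random.randn(512)

class FastLocalPolicy:
    """Stand-in for a small non-pretrained network (high frequency, ~50 Hz)."""
    def act(self, global_feat: np.ndarray, proprio: np.ndarray,
            wrist_img: np.ndarray) -> np.ndarray:
        # Placeholder: a small learned head would map features to an action.
        return np.zeros(7)  # e.g. 6-DoF end-effector delta + gripper command

def control_loop(env, instruction: str, steps: int = 500, slow_every: int = 50):
    """Run fast control while refreshing global features only every `slow_every` steps."""
    slow, fast = SlowGlobalEncoder(), FastLocalPolicy()
    obs = env.reset()
    global_feat = None
    for t in range(steps):
        if t % slow_every == 0:  # low-frequency global update from the large model
            global_feat = slow.encode(obs["rgb_global"], instruction)
        action = fast.act(global_feat, obs["proprio"], obs["rgb_wrist"])  # every step
        obs = env.step(action)
    return obs
```

The `env` object and its observation dictionary are assumed; the point of the sketch is only the two update rates sharing one loop.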
Lidar Line Selection with Spatially-Aware Shapley Value for Cost-Efficient Depth Completion
Conference on Robot Learning Pub Date: 2023-03-21 DOI: 10.48550/arXiv.2303.11720
Kamil Adamczewski, Christos Sakaridis, Vaishakh Patil, L. Gool
{"title":"Lidar Line Selection with Spatially-Aware Shapley Value for Cost-Efficient Depth Completion","authors":"Kamil Adamczewski, Christos Sakaridis, Vaishakh Patil, L. Gool","doi":"10.48550/arXiv.2303.11720","DOIUrl":"https://doi.org/10.48550/arXiv.2303.11720","url":null,"abstract":"Lidar is a vital sensor for estimating the depth of a scene. Typical spinning lidars emit pulses arranged in several horizontal lines and the monetary cost of the sensor increases with the number of these lines. In this work, we present the new problem of optimizing the positioning of lidar lines to find the most effective configuration for the depth completion task. We propose a solution to reduce the number of lines while retaining the up-to-the-mark quality of depth completion. Our method consists of two components, (1) line selection based on the marginal contribution of a line computed via the Shapley value and (2) incorporating line position spread to take into account its need to arrive at image-wide depth completion. Spatially-aware Shapley values (SaS) succeed in selecting line subsets that yield a depth accuracy comparable to the full lidar input while using just half of the lines.","PeriodicalId":273870,"journal":{"name":"Conference on Robot Learning","volume":"170 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117280268","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cited: 0
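The line-selection component described above, ranking lidar lines by their marginal contribution via the Shapley value, can be sketched with a standard Monte Carlo permutation estimator. The `utility` callback (depth-completion quality achieved with a subset of lines) is an assumed placeholder; the paper's spatially-aware variant additionally accounts for the spread of selected line positions, which is not modeled here.

```python
import random
from typing import Callable, Dict, FrozenSet, List

def shapley_line_values(lines: List[int],
                        utility: Callable[[FrozenSet[int]], float],
                        n_permutations: int = 200) -> Dict[int, float]:
    """Monte Carlo estimate of each lidar line's Shapley value.

    `utility(subset)` is an assumed callback returning depth-completion
    quality (higher is better) obtained with that subset of lines.
    """
    values = {line: 0.0 for line in lines}
    for _ in range(n_permutations):
        order = lines[:]
        random.shuffle(order)
        coalition = frozenset()
        prev = utility(coalition)
        for line in order:
            coalition = coalition | {line}
            curr = utility(coalition)
            values[line] += curr - prev  # marginal contribution in this permutation
            prev = curr
    return {line: v / n_permutations for line, v in values.items()}

# Usage sketch: keep the top half of lines by estimated value.
# vals = shapley_line_values(list(range(64)), utility)
# keep = sorted(vals, key=vals.get, reverse=True)[:32]
```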
Safe Robot Learning in Assistive Devices through Neural Network Repair
Conference on Robot Learning Pub Date: 2023-03-08 DOI: 10.48550/arXiv.2303.04431
K. Majd, Geoffrey Clark, Tanmay Khandait, Siyu Zhou, S. Sankaranarayanan, Georgios Fainekos, H. B. Amor
{"title":"Safe Robot Learning in Assistive Devices through Neural Network Repair","authors":"K. Majd, Geoffrey Clark, Tanmay Khandait, Siyu Zhou, S. Sankaranarayanan, Georgios Fainekos, H. B. Amor","doi":"10.48550/arXiv.2303.04431","DOIUrl":"https://doi.org/10.48550/arXiv.2303.04431","url":null,"abstract":"Assistive robotic devices are a particularly promising field of application for neural networks (NN) due to the need for personalization and hard-to-model human-machine interaction dynamics. However, NN based estimators and controllers may produce potentially unsafe outputs over previously unseen data points. In this paper, we introduce an algorithm for updating NN control policies to satisfy a given set of formal safety constraints, while also optimizing the original loss function. Given a set of mixed-integer linear constraints, we define the NN repair problem as a Mixed Integer Quadratic Program (MIQP). In extensive experiments, we demonstrate the efficacy of our repair method in generating safe policies for a lower-leg prosthesis.","PeriodicalId":273870,"journal":{"name":"Conference on Robot Learning","volume":"203 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115027916","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cited: 0
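A heavily simplified sketch of the repair idea: minimally perturb network parameters so that sampled outputs satisfy linear safety constraints. For illustration, only the final linear layer is repaired, which reduces the problem to a plain quadratic program solvable with cvxpy; the paper's MIQP formulation repairs deeper layers and encodes ReLU activations with mixed-integer constraints. All names and shapes below are assumptions.

```python
import cvxpy as cp
import numpy as np

def repair_last_layer(W0: np.ndarray, b0: np.ndarray,
                      feats: np.ndarray, A: np.ndarray, c: np.ndarray):
    """Minimally perturb the last layer (W0, b0) so every sampled output
    y = W @ phi + b satisfies the linear safety constraints A @ y <= c.

    feats: (N, d) penultimate-layer activations of safety-critical samples.
    """
    W = cp.Variable(W0.shape)
    b = cp.Variable(b0.shape)
    constraints = []
    for phi in feats:
        y = W @ phi + b
        constraints.append(A @ y <= c)  # formal safety constraint on this sample
    # Stay close to the original parameters (a proxy for preserving the original loss).
    objective = cp.Minimize(cp.sum_squares(W - W0) + cp.sum_squares(b - b0))
    cp.Problem(objective, constraints).solve()
    return W.value, b.value
```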
COACH: Cooperative Robot Teaching
Conference on Robot Learning Pub Date: 2023-02-13 DOI: 10.48550/arXiv.2302.06199
Cunjun Yu, Yiqing Xu, Linfeng Li, David Hsu
{"title":"COACH: Cooperative Robot Teaching","authors":"Cunjun Yu, Yiqing Xu, Linfeng Li, David Hsu","doi":"10.48550/arXiv.2302.06199","DOIUrl":"https://doi.org/10.48550/arXiv.2302.06199","url":null,"abstract":"Knowledge and skills can transfer from human teachers to human students. However, such direct transfer is often not scalable for physical tasks, as they require one-to-one interaction, and human teachers are not available in sufficient numbers. Machine learning enables robots to become experts and play the role of teachers to help in this situation. In this work, we formalize cooperative robot teaching as a Markov game, consisting of four key elements: the target task, the student model, the teacher model, and the interactive teaching-learning process. Under a moderate assumption, the Markov game reduces to a partially observable Markov decision process, with an efficient approximate solution. We illustrate our approach on two cooperative tasks, one in a simulated video game and one with a real robot.","PeriodicalId":273870,"journal":{"name":"Conference on Robot Learning","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-02-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133480628","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cited: 3
Learning Goal-Conditioned Policies Offline with Self-Supervised Reward Shaping
Conference on Robot Learning Pub Date: 2023-01-05 DOI: 10.48550/arXiv.2301.02099
Lina Mezghani, Sainbayar Sukhbaatar, Piotr Bojanowski, A. Lazaric, Alahari Karteek
{"title":"Learning Goal-Conditioned Policies Offline with Self-Supervised Reward Shaping","authors":"Lina Mezghani, Sainbayar Sukhbaatar, Piotr Bojanowski, A. Lazaric, Alahari Karteek","doi":"10.48550/arXiv.2301.02099","DOIUrl":"https://doi.org/10.48550/arXiv.2301.02099","url":null,"abstract":"Developing agents that can execute multiple skills by learning from pre-collected datasets is an important problem in robotics, where online interaction with the environment is extremely time-consuming. Moreover, manually designing reward functions for every single desired skill is prohibitive. Prior works targeted these challenges by learning goal-conditioned policies from offline datasets without manually specified rewards, through hindsight relabelling. These methods suffer from the issue of sparsity of rewards, and fail at long-horizon tasks. In this work, we propose a novel self-supervised learning phase on the pre-collected dataset to understand the structure and the dynamics of the model, and shape a dense reward function for learning policies offline. We evaluate our method on three continuous control tasks, and show that our model significantly outperforms existing approaches, especially on tasks that involve long-term planning.","PeriodicalId":273870,"journal":{"name":"Conference on Robot Learning","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129411361","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cited: 6
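The combination of hindsight relabelling with a dense shaped reward, as described above, can be sketched as follows. The learned state embedding `encode` (assumed to come from the self-supervised phase) and the trajectory layout are illustrative assumptions, not the authors' code.

```python
import numpy as np

def relabel_and_shape(trajectory, encode, k_future=50):
    """Turn a reward-free trajectory into goal-conditioned transitions.

    trajectory: list of (obs, action, next_obs) tuples with no reward labels.
    encode:     learned state embedding; distance in this space shapes a dense reward.
    """
    transitions = []
    T = len(trajectory)
    for t, (obs, action, next_obs) in enumerate(trajectory):
        # Hindsight relabelling: pick a future state from the same trajectory as the goal.
        g_idx = np.random.randint(t, min(t + k_future, T))
        goal = trajectory[g_idx][2]
        # Dense shaped reward: negative embedding distance to the relabelled goal.
        reward = -np.linalg.norm(encode(next_obs) - encode(goal))
        done = g_idx == t  # the goal is reached within this transition
        transitions.append((obs, goal, action, reward, next_obs, done))
    return transitions
```

The resulting transitions would then feed a standard offline RL learner; that part is omitted here.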
Learning Road Scene-level Representations via Semantic Region Prediction
Conference on Robot Learning Pub Date: 2023-01-02 DOI: 10.48550/arXiv.2301.00714
Zihao Xiao, A. Yuille, Yi-Ting Chen
{"title":"Learning Road Scene-level Representations via Semantic Region Prediction","authors":"Zihao Xiao, A. Yuille, Yi-Ting Chen","doi":"10.48550/arXiv.2301.00714","DOIUrl":"https://doi.org/10.48550/arXiv.2301.00714","url":null,"abstract":"In this work, we tackle two vital tasks in automated driving systems, i.e., driver intent prediction and risk object identification from egocentric images. Mainly, we investigate the question: what would be good road scene-level representations for these two tasks? We contend that a scene-level representation must capture higher-level semantic and geometric representations of traffic scenes around ego-vehicle while performing actions to their destinations. To this end, we introduce the representation of semantic regions, which are areas where ego-vehicles visit while taking an afforded action (e.g., left-turn at 4-way intersections). We propose to learn scene-level representations via a novel semantic region prediction task and an automatic semantic region labeling algorithm. Extensive evaluations are conducted on the HDD and nuScenes datasets, and the learned representations lead to state-of-the-art performance for driver intention prediction and risk object identification.","PeriodicalId":273870,"journal":{"name":"Conference on Robot Learning","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115113805","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cited: 1
Offline Reinforcement Learning for Visual Navigation
Conference on Robot Learning Pub Date: 2022-12-16 DOI: 10.48550/arXiv.2212.08244
Dhruv Shah, Arjun Bhorkar, Hrish Leen, Ilya Kostrikov, Nicholas Rhinehart, S. Levine
{"title":"Offline Reinforcement Learning for Visual Navigation","authors":"Dhruv Shah, Arjun Bhorkar, Hrish Leen, Ilya Kostrikov, Nicholas Rhinehart, S. Levine","doi":"10.48550/arXiv.2212.08244","DOIUrl":"https://doi.org/10.48550/arXiv.2212.08244","url":null,"abstract":"Reinforcement learning can enable robots to navigate to distant goals while optimizing user-specified reward functions, including preferences for following lanes, staying on paved paths, or avoiding freshly mowed grass. However, online learning from trial-and-error for real-world robots is logistically challenging, and methods that instead can utilize existing datasets of robotic navigation data could be significantly more scalable and enable broader generalization. In this paper, we present ReViND, the first offline RL system for robotic navigation that can leverage previously collected data to optimize user-specified reward functions in the real-world. We evaluate our system for off-road navigation without any additional data collection or fine-tuning, and show that it can navigate to distant goals using only offline training from this dataset, and exhibit behaviors that qualitatively differ based on the user-specified reward function.","PeriodicalId":273870,"journal":{"name":"Conference on Robot Learning","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121279339","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cited: 10
JFP: Joint Future Prediction with Interactive Multi-Agent Modeling for Autonomous Driving
Conference on Robot Learning Pub Date: 2022-12-16 DOI: 10.48550/arXiv.2212.08710
Wenjie Luo, C. Park, Andre Cornman, Benjamin Sapp, Drago Anguelov
{"title":"JFP: Joint Future Prediction with Interactive Multi-Agent Modeling for Autonomous Driving","authors":"Wenjie Luo, C. Park, Andre Cornman, Benjamin Sapp, Drago Anguelov","doi":"10.48550/arXiv.2212.08710","DOIUrl":"https://doi.org/10.48550/arXiv.2212.08710","url":null,"abstract":"We propose JFP, a Joint Future Prediction model that can learn to generate accurate and consistent multi-agent future trajectories. For this task, many different methods have been proposed to capture social interactions in the encoding part of the model, however, considerably less focus has been placed on representing interactions in the decoder and output stages. As a result, the predicted trajectories are not necessarily consistent with each other, and often result in unrealistic trajectory overlaps. In contrast, we propose an end-to-end trainable model that learns directly the interaction between pairs of agents in a structured, graphical model formulation in order to generate consistent future trajectories. It sets new state-of-the-art results on Waymo Open Motion Dataset (WOMD) for the interactive setting. We also investigate a more complex multi-agent setting for both WOMD and a larger internal dataset, where our approach improves significantly on the trajectory overlap metrics while obtaining on-par or better performance on single-agent trajectory metrics.","PeriodicalId":273870,"journal":{"name":"Conference on Robot Learning","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125621228","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cited: 15
Learning Markerless Robot-Depth Camera Calibration and End-Effector Pose Estimation
Conference on Robot Learning Pub Date: 2022-12-15 DOI: 10.48550/arXiv.2212.07567
B. C. Sefercik, Barış Akgün
{"title":"Learning Markerless Robot-Depth Camera Calibration and End-Effector Pose Estimation","authors":"B. C. Sefercik, Barış Akgün","doi":"10.48550/arXiv.2212.07567","DOIUrl":"https://doi.org/10.48550/arXiv.2212.07567","url":null,"abstract":"Traditional approaches to extrinsic calibration use fiducial markers and learning-based approaches rely heavily on simulation data. In this work, we present a learning-based markerless extrinsic calibration system that uses a depth camera and does not rely on simulation data. We learn models for end-effector (EE) segmentation, single-frame rotation prediction and keypoint detection, from automatically generated real-world data. We use a transformation trick to get EE pose estimates from rotation predictions and a matching algorithm to get EE pose estimates from keypoint predictions. We further utilize the iterative closest point algorithm, multiple-frames, filtering and outlier detection to increase calibration robustness. Our evaluations with training data from multiple camera poses and test data from previously unseen poses give sub-centimeter and sub-deciradian average calibration and pose estimation errors. We also show that a carefully selected single training pose gives comparable results.","PeriodicalId":273870,"journal":{"name":"Conference on Robot Learning","volume":"2 1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116862241","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cited: 2
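One step implied by the abstract, recovering a camera-to-robot transform from matched 3D keypoints before ICP refinement, is the classic least-squares rigid alignment (Kabsch/Umeyama). A minimal numpy sketch, assuming keypoint correspondences are already available; how those correspondences are obtained in the paper (keypoint detection plus matching) is not reproduced here.

```python
import numpy as np

def rigid_transform(P: np.ndarray, Q: np.ndarray):
    """Least-squares rotation R and translation t such that R @ P[i] + t ~= Q[i].

    P: (N, 3) keypoints in the camera frame (e.g. detected end-effector keypoints).
    Q: (N, 3) the same keypoints in the robot base frame (e.g. from forward kinematics).
    """
    cP, cQ = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cP).T @ (Q - cQ)                     # 3x3 cross-covariance of centered points
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # guard against reflections
    R = Vt.T @ D @ U.T
    t = cQ - R @ cP
    return R, t
```

The returned (R, t) is the camera-to-base extrinsic estimate, which could then be refined with ICP and multi-frame filtering as the abstract describes.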
HUM3DIL: Semi-supervised Multi-modal 3D Human Pose Estimation for Autonomous Driving
Conference on Robot Learning Pub Date: 2022-12-15 DOI: 10.48550/arXiv.2212.07729
Andrei Zanfir, M. Zanfir, Alexander N. Gorban, Jingwei Ji, Yin Zhou, Drago Anguelov, C. Sminchisescu
{"title":"HUM3DIL: Semi-supervised Multi-modal 3D Human Pose Estimation for Autonomous Driving","authors":"Andrei Zanfir, M. Zanfir, Alexander N. Gorban, Jingwei Ji, Yin Zhou, Drago Anguelov, C. Sminchisescu","doi":"10.48550/arXiv.2212.07729","DOIUrl":"https://doi.org/10.48550/arXiv.2212.07729","url":null,"abstract":"Autonomous driving is an exciting new industry, posing important research questions. Within the perception module, 3D human pose estimation is an emerging technology, which can enable the autonomous vehicle to perceive and understand the subtle and complex behaviors of pedestrians. While hardware systems and sensors have dramatically improved over the decades -- with cars potentially boasting complex LiDAR and vision systems and with a growing expansion of the available body of dedicated datasets for this newly available information -- not much work has been done to harness these novel signals for the core problem of 3D human pose estimation. Our method, which we coin HUM3DIL (HUMan 3D from Images and LiDAR), efficiently makes use of these complementary signals, in a semi-supervised fashion and outperforms existing methods with a large margin. It is a fast and compact model for onboard deployment. Specifically, we embed LiDAR points into pixel-aligned multi-modal features, which we pass through a sequence of Transformer refinement stages. Quantitative experiments on the Waymo Open Dataset support these claims, where we achieve state-of-the-art results on the task of 3D pose estimation.","PeriodicalId":273870,"journal":{"name":"Conference on Robot Learning","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125939427","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cited: 8
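The phrase "embed LiDAR points into pixel-aligned multi-modal features" can be illustrated by projecting points through camera intrinsics/extrinsics and gathering the image-feature vectors at the projected pixels. The variable names, the feature-map source, and the depth threshold below are assumptions, not the paper's implementation.

```python
import numpy as np

def pixel_aligned_lidar_features(points, K, T_cam_lidar, feat_map):
    """Project LiDAR points into the image and gather per-point image features.

    points:      (N, 3) LiDAR points in the lidar frame.
    K:           (3, 3) camera intrinsics.
    T_cam_lidar: (4, 4) lidar-to-camera extrinsics.
    feat_map:    (H, W, C) image feature map (e.g. a CNN backbone output).
    Returns (M, 3 + C): valid camera-frame points concatenated with their
    pixel-aligned image features.
    """
    pts_h = np.concatenate([points, np.ones((len(points), 1))], axis=1)
    cam = (T_cam_lidar @ pts_h.T).T[:, :3]          # points in the camera frame
    in_front = cam[:, 2] > 0.1                      # keep points in front of the camera
    uvw = (K @ cam[in_front].T).T
    uv = (uvw[:, :2] / uvw[:, 2:3]).astype(int)     # pixel coordinates (u, v)
    H, W, _ = feat_map.shape
    valid = (uv[:, 0] >= 0) & (uv[:, 0] < W) & (uv[:, 1] >= 0) & (uv[:, 1] < H)
    feats = feat_map[uv[valid, 1], uv[valid, 0]]    # gather features at (v, u)
    return np.concatenate([cam[in_front][valid], feats], axis=1)
```

These per-point multi-modal features would then be fed to the Transformer refinement stages mentioned in the abstract.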