Conference on Robot Learning: Latest Publications

Vision-based Uneven BEV Representation Learning with Polar Rasterization and Surface Estimation
Conference on Robot Learning Pub Date: 2022-07-05 DOI: 10.48550/arXiv.2207.01878
Zhi Liu, Shaoyu Chen, Xiaojie Guo, Xinggang Wang, Tianheng Cheng, Hong Zhu, Qian Zhang, Wenyu Liu, Yi Zhang
{"title":"Vision-based Uneven BEV Representation Learning with Polar Rasterization and Surface Estimation","authors":"Zhi Liu, Shaoyu Chen, Xiaojie Guo, Xinggang Wang, Tianheng Cheng, Hong Zhu, Qian Zhang, Wenyu Liu, Yi Zhang","doi":"10.48550/arXiv.2207.01878","DOIUrl":"https://doi.org/10.48550/arXiv.2207.01878","url":null,"abstract":"In this work, we propose PolarBEV for vision-based uneven BEV representation learning. To adapt to the foreshortening effect of camera imaging, we rasterize the BEV space both angularly and radially, and introduce polar embedding decomposition to model the associations among polar grids. Polar grids are rearranged to an array-like regular representation for efficient processing. Besides, to determine the 2D-to-3D correspondence, we iteratively update the BEV surface based on a hypothetical plane, and adopt height-based feature transformation. PolarBEV keeps real-time inference speed on a single 2080Ti GPU, and outperforms other methods for both BEV semantic segmentation and BEV instance segmentation. Thorough ablations are presented to validate the design. The code will be released at url{https://github.com/SuperZ-Liu/PolarBEV}.","PeriodicalId":273870,"journal":{"name":"Conference on Robot Learning","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116051012","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 12
USHER: Unbiased Sampling for Hindsight Experience Replay
Conference on Robot Learning Pub Date: 2022-07-03 DOI: 10.48550/arXiv.2207.01115
Liam Schramm, Yunfu Deng, Edgar Granados, Abdeslam Boularias
{"title":"USHER: Unbiased Sampling for Hindsight Experience Replay","authors":"Liam Schramm, Yunfu Deng, Edgar Granados, Abdeslam Boularias","doi":"10.48550/arXiv.2207.01115","DOIUrl":"https://doi.org/10.48550/arXiv.2207.01115","url":null,"abstract":"Dealing with sparse rewards is a long-standing challenge in reinforcement learning (RL). Hindsight Experience Replay (HER) addresses this problem by reusing failed trajectories for one goal as successful trajectories for another. This allows for both a minimum density of reward and for generalization across multiple goals. However, this strategy is known to result in a biased value function, as the update rule underestimates the likelihood of bad outcomes in a stochastic environment. We propose an asymptotically unbiased importance-sampling-based algorithm to address this problem without sacrificing performance on deterministic environments. We show its effectiveness on a range of robotic systems, including challenging high dimensional stochastic environments.","PeriodicalId":273870,"journal":{"name":"Conference on Robot Learning","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127786512","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Discriminator-Guided Model-Based Offline Imitation Learning
Conference on Robot Learning Pub Date: 2022-07-01 DOI: 10.48550/arXiv.2207.00244
Wenjia Zhang, Haoran Xu, Haoyi Niu, Peng Cheng, Ming Li, Heming Zhang, Guyue Zhou, Xianyuan Zhan
{"title":"Discriminator-Guided Model-Based Offline Imitation Learning","authors":"Wenjia Zhang, Haoran Xu, Haoyi Niu, Peng Cheng, Ming Li, Heming Zhang, Guyue Zhou, Xianyuan Zhan","doi":"10.48550/arXiv.2207.00244","DOIUrl":"https://doi.org/10.48550/arXiv.2207.00244","url":null,"abstract":"Offline imitation learning (IL) is a powerful method to solve decision-making problems from expert demonstrations without reward labels. Existing offline IL methods suffer from severe performance degeneration under limited expert data. Including a learned dynamics model can potentially improve the state-action space coverage of expert data, however, it also faces challenging issues like model approximation/generalization errors and suboptimality of rollout data. In this paper, we propose the Discriminator-guided Model-based offline Imitation Learning (DMIL) framework, which introduces a discriminator to simultaneously distinguish the dynamics correctness and suboptimality of model rollout data against real expert demonstrations. DMIL adopts a novel cooperative-yet-adversarial learning strategy, which uses the discriminator to guide and couple the learning process of the policy and dynamics model, resulting in improved model performance and robustness. Our framework can also be extended to the case when demonstrations contain a large proportion of suboptimal data. Experimental results show that DMIL and its extension achieve superior performance and robustness compared to state-of-the-art offline IL methods under small datasets.","PeriodicalId":273870,"journal":{"name":"Conference on Robot Learning","volume":"145 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131254668","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
Learning Diverse and Physically Feasible Dexterous Grasps with Generative Model and Bilevel Optimization
Conference on Robot Learning Pub Date: 2022-07-01 DOI: 10.48550/arXiv.2207.00195
A. Wu, Michelle Guo, Karen Liu
{"title":"Learning Diverse and Physically Feasible Dexterous Grasps with Generative Model and Bilevel Optimization","authors":"A. Wu, Michelle Guo, Karen Liu","doi":"10.48550/arXiv.2207.00195","DOIUrl":"https://doi.org/10.48550/arXiv.2207.00195","url":null,"abstract":"To fully utilize the versatility of a multi-fingered dexterous robotic hand for executing diverse object grasps, one must consider the rich physical constraints introduced by hand-object interaction and object geometry. We propose an integrative approach of combining a generative model and a bilevel optimization (BO) to plan diverse grasp configurations on novel objects. First, a conditional variational autoencoder trained on merely six YCB objects predicts the finger placement directly from the object point cloud. The prediction is then used to seed a nonconvex BO that solves for a grasp configuration under collision, reachability, wrench closure, and friction constraints. Our method achieved an 86.7% success over 120 real world grasping trials on 20 household objects, including unseen and challenging geometries. Through quantitative empirical evaluations, we confirm that grasp configurations produced by our pipeline are indeed guaranteed to satisfy kinematic and dynamic constraints. A video summary of our results is available at youtu.be/9DTrImbN99I.","PeriodicalId":273870,"journal":{"name":"Conference on Robot Learning","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121367355","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 10
Watch and Match: Supercharging Imitation with Regularized Optimal Transport
Conference on Robot Learning Pub Date: 2022-06-30 DOI: 10.48550/arXiv.2206.15469
Siddhant Haldar, Vaibhav Mathur, Denis Yarats, Lerrel Pinto
{"title":"Watch and Match: Supercharging Imitation with Regularized Optimal Transport","authors":"Siddhant Haldar, Vaibhav Mathur, Denis Yarats, Lerrel Pinto","doi":"10.48550/arXiv.2206.15469","DOIUrl":"https://doi.org/10.48550/arXiv.2206.15469","url":null,"abstract":"Imitation learning holds tremendous promise in learning policies efficiently for complex decision making problems. Current state-of-the-art algorithms often use inverse reinforcement learning (IRL), where given a set of expert demonstrations, an agent alternatively infers a reward function and the associated optimal policy. However, such IRL approaches often require substantial online interactions for complex control problems. In this work, we present Regularized Optimal Transport (ROT), a new imitation learning algorithm that builds on recent advances in optimal transport based trajectory-matching. Our key technical insight is that adaptively combining trajectory-matching rewards with behavior cloning can significantly accelerate imitation even with only a few demonstrations. Our experiments on 20 visual control tasks across the DeepMind Control Suite, the OpenAI Robotics Suite, and the Meta-World Benchmark demonstrate an average of 7.8X faster imitation to reach 90% of expert performance compared to prior state-of-the-art methods. On real-world robotic manipulation, with just one demonstration and an hour of online training, ROT achieves an average success rate of 90.1% across 14 tasks.","PeriodicalId":273870,"journal":{"name":"Conference on Robot Learning","volume":"605 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116382638","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 28
Fleet-DAgger: Interactive Robot Fleet Learning with Scalable Human Supervision
Conference on Robot Learning Pub Date: 2022-06-29 DOI: 10.48550/arXiv.2206.14349
Ryan Hoque, Lawrence Yunliang Chen, Satvik Sharma, K. Dharmarajan, Brijen Thananjeyan, P. Abbeel, Ken Goldberg
{"title":"Fleet-DAgger: Interactive Robot Fleet Learning with Scalable Human Supervision","authors":"Ryan Hoque, Lawrence Yunliang Chen, Satvik Sharma, K. Dharmarajan, Brijen Thananjeyan, P. Abbeel, Ken Goldberg","doi":"10.48550/arXiv.2206.14349","DOIUrl":"https://doi.org/10.48550/arXiv.2206.14349","url":null,"abstract":"Commercial and industrial deployments of robot fleets at Amazon, Nimble, Plus One, Waymo, and Zoox query remote human teleoperators when robots are at risk or unable to make task progress. With continual learning, interventions from the remote pool of humans can also be used to improve the robot fleet control policy over time. A central question is how to effectively allocate limited human attention. Prior work addresses this in the single-robot, single-human setting; we formalize the Interactive Fleet Learning (IFL) setting, in which multiple robots interactively query and learn from multiple human supervisors. We propose Return on Human Effort (ROHE) as a new metric and Fleet-DAgger, a family of IFL algorithms. We present an open-source IFL benchmark suite of GPU-accelerated Isaac Gym environments for standardized evaluation and development of IFL algorithms. We compare a novel Fleet-DAgger algorithm to 4 baselines with 100 robots in simulation. We also perform a physical block-pushing experiment with 4 ABB YuMi robot arms and 2 remote humans. Experiments suggest that the allocation of humans to robots significantly affects the performance of the fleet, and that the novel Fleet-DAgger algorithm can achieve up to 8.8x higher ROHE than baselines. See https://tinyurl.com/fleet-dagger for supplemental material.","PeriodicalId":273870,"journal":{"name":"Conference on Robot Learning","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116153148","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 10
Masked World Models for Visual Control
Conference on Robot Learning Pub Date: 2022-06-28 DOI: 10.48550/arXiv.2206.14244
Younggyo Seo, Danijar Hafner, Hao Liu, Fangchen Liu, Stephen James, Kimin Lee, P. Abbeel
{"title":"Masked World Models for Visual Control","authors":"Younggyo Seo, Danijar Hafner, Hao Liu, Fangchen Liu, Stephen James, Kimin Lee, P. Abbeel","doi":"10.48550/arXiv.2206.14244","DOIUrl":"https://doi.org/10.48550/arXiv.2206.14244","url":null,"abstract":"Visual model-based reinforcement learning (RL) has the potential to enable sample-efficient robot learning from visual observations. Yet the current approaches typically train a single model end-to-end for learning both visual representations and dynamics, making it difficult to accurately model the interaction between robots and small objects. In this work, we introduce a visual model-based RL framework that decouples visual representation learning and dynamics learning. Specifically, we train an autoencoder with convolutional layers and vision transformers (ViT) to reconstruct pixels given masked convolutional features, and learn a latent dynamics model that operates on the representations from the autoencoder. Moreover, to encode task-relevant information, we introduce an auxiliary reward prediction objective for the autoencoder. We continually update both autoencoder and dynamics model using online samples collected from environment interaction. We demonstrate that our decoupling approach achieves state-of-the-art performance on a variety of visual robotic tasks from Meta-world and RLBench, e.g., we achieve 81.7% success rate on 50 visual robotic manipulation tasks from Meta-world, while the baseline achieves 67.9%. Code is available on the project website: https://sites.google.com/view/mwm-rl.","PeriodicalId":273870,"journal":{"name":"Conference on Robot Learning","volume":"85 7","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131958440","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 56
Rethinking Optimization with Differentiable Simulation from a Global Perspective
Conference on Robot Learning Pub Date: 2022-06-28 DOI: 10.48550/arXiv.2207.00167
Rika Antonova, Jingyun Yang, Krishna Murthy Jatavallabhula, J. Bohg
{"title":"Rethinking Optimization with Differentiable Simulation from a Global Perspective","authors":"Rika Antonova, Jingyun Yang, Krishna Murthy Jatavallabhula, J. Bohg","doi":"10.48550/arXiv.2207.00167","DOIUrl":"https://doi.org/10.48550/arXiv.2207.00167","url":null,"abstract":"Differentiable simulation is a promising toolkit for fast gradient-based policy optimization and system identification. However, existing approaches to differentiable simulation have largely tackled scenarios where obtaining smooth gradients has been relatively easy, such as systems with mostly smooth dynamics. In this work, we study the challenges that differentiable simulation presents when it is not feasible to expect that a single descent reaches a global optimum, which is often a problem in contact-rich scenarios. We analyze the optimization landscapes of diverse scenarios that contain both rigid bodies and deformable objects. In dynamic environments with highly deformable objects and fluids, differentiable simulators produce rugged landscapes with nonetheless useful gradients in some parts of the space. We propose a method that combines Bayesian optimization with semi-local 'leaps' to obtain a global search method that can use gradients effectively, while also maintaining robust performance in regions with noisy gradients. We show that our approach outperforms several gradient-based and gradient-free baselines on an extensive set of experiments in simulation, and also validate the method using experiments with a real robot and deformables. Videos and supplementary materials are available at https://tinyurl.com/globdiff","PeriodicalId":273870,"journal":{"name":"Conference on Robot Learning","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130450718","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 10
DayDreamer: World Models for Physical Robot Learning
Conference on Robot Learning Pub Date: 2022-06-28 DOI: 10.48550/arXiv.2206.14176
Philipp Wu, Alejandro Escontrela, Danijar Hafner, Ken Goldberg, P. Abbeel
{"title":"DayDreamer: World Models for Physical Robot Learning","authors":"Philipp Wu, Alejandro Escontrela, Danijar Hafner, Ken Goldberg, P. Abbeel","doi":"10.48550/arXiv.2206.14176","DOIUrl":"https://doi.org/10.48550/arXiv.2206.14176","url":null,"abstract":"To solve tasks in complex environments, robots need to learn from experience. Deep reinforcement learning is a common approach to robot learning but requires a large amount of trial and error to learn, limiting its deployment in the physical world. As a consequence, many advances in robot learning rely on simulators. On the other hand, learning inside of simulators fails to capture the complexity of the real world, is prone to simulator inaccuracies, and the resulting behaviors do not adapt to changes in the world. The Dreamer algorithm has recently shown great promise for learning from small amounts of interaction by planning within a learned world model, outperforming pure reinforcement learning in video games. Learning a world model to predict the outcomes of potential actions enables planning in imagination, reducing the amount of trial and error needed in the real environment. However, it is unknown whether Dreamer can facilitate faster learning on physical robots. In this paper, we apply Dreamer to 4 robots to learn online and directly in the real world, without simulators. Dreamer trains a quadruped robot to roll off its back, stand up, and walk from scratch and without resets in only 1 hour. We then push the robot and find that Dreamer adapts within 10 minutes to withstand perturbations or quickly roll over and stand back up. On two different robotic arms, Dreamer learns to pick and place multiple objects directly from camera images and sparse rewards, approaching human performance. On a wheeled robot, Dreamer learns to navigate to a goal position purely from camera images, automatically resolving ambiguity about the robot orientation. Using the same hyperparameters across all experiments, we find that Dreamer is capable of online learning in the real world, establishing a strong baseline. We release our infrastructure for future applications of world models to robot learning.","PeriodicalId":273870,"journal":{"name":"Conference on Robot Learning","volume":"6 7","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114023833","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 105
LaRa: Latents and Rays for Multi-Camera Bird's-Eye-View Semantic Segmentation
Conference on Robot Learning Pub Date: 2022-06-27 DOI: 10.48550/arXiv.2206.13294
Florent Bartoccioni, Éloi Zablocki, Andrei Bursuc, Patrick Pérez, M. Cord, Alahari Karteek
{"title":"LaRa: Latents and Rays for Multi-Camera Bird's-Eye-View Semantic Segmentation","authors":"Florent Bartoccioni, 'Eloi Zablocki, Andrei Bursuc, Patrick P'erez, M. Cord, Alahari Karteek","doi":"10.48550/arXiv.2206.13294","DOIUrl":"https://doi.org/10.48550/arXiv.2206.13294","url":null,"abstract":"Recent works in autonomous driving have widely adopted the bird's-eye-view (BEV) semantic map as an intermediate representation of the world. Online prediction of these BEV maps involves non-trivial operations such as multi-camera data extraction as well as fusion and projection into a common topview grid. This is usually done with error-prone geometric operations (e.g., homography or back-projection from monocular depth estimation) or expensive direct dense mapping between image pixels and pixels in BEV (e.g., with MLP or attention). In this work, we present 'LaRa', an efficient encoder-decoder, transformer-based model for vehicle semantic segmentation from multiple cameras. Our approach uses a system of cross-attention to aggregate information over multiple sensors into a compact, yet rich, collection of latent representations. These latent representations, after being processed by a series of self-attention blocks, are then reprojected with a second cross-attention in the BEV space. We demonstrate that our model outperforms the best previous works using transformers on nuScenes. The code and trained models are available at https://github.com/valeoai/LaRa","PeriodicalId":273870,"journal":{"name":"Conference on Robot Learning","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-06-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128479094","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 12