OIL: Observational Imitation Learning

Robotics: Science and Systems XV | Pub Date: 2018-03-03 | DOI: 10.15607/RSS.2019.XV.005

G. Li, Matthias Müller, Vincent Casser, Neil G. Smith, D. Michels, Bernard Ghanem

Citations: 33

Abstract

Recent work has explored the problem of autonomous navigation by imitating a teacher and learning an end-to-end policy, which directly predicts controls from raw images. However, these approaches tend to be sensitive to mistakes by the teacher and do not scale well to other environments or vehicles. To this end, we propose Observational Imitation Learning (OIL), a novel imitation learning variant that supports online training and automatic selection of optimal behavior by observing multiple imperfect teachers. We apply our proposed methodology to the challenging problems of autonomous driving and UAV racing. For both tasks, we utilize the Sim4CV simulator that enables the generation of large amounts of synthetic training data and also allows for online learning and evaluation. We train a perception network to predict waypoints from raw image data and use OIL to train another network to predict controls from these waypoints. Extensive experiments demonstrate that our trained network outperforms its teachers, conventional imitation learning (IL) and reinforcement learning (RL) baselines and even humans in simulation. The project website is available at this https URL and a video at this https URL
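
The abstract does not spell out how OIL selects among its teachers, so the following is only a toy sketch of the general idea of observing several imperfect teachers and imitating whichever behaves best. It assumes a 1-D tracking task, proportional controllers as teachers, a short simulated rollout as the selection criterion, and a one-parameter linear student. All names (`make_p_controller`, `rollout_cost`, `select_teacher`, `collect_demonstrations`, `fit_linear_policy`) and all numeric values are hypothetical and are not taken from the paper.

```python
from typing import Callable, List, Sequence, Tuple

# A teacher maps the current tracking error to a control (assumption: 1-D toy task).
Teacher = Callable[[float], float]


def make_p_controller(gain: float) -> Teacher:
    """Hypothetical imperfect teacher: a proportional controller with a fixed gain."""
    return lambda error: gain * error


def step(state: float, control: float, dt: float = 0.1) -> float:
    """Toy dynamics: the control acts as a velocity applied for dt seconds."""
    return state + control * dt


def rollout_cost(teacher: Teacher, state: float, target: float, horizon: int = 20) -> float:
    """Accumulated absolute tracking error when this teacher drives for `horizon` steps."""
    cost = 0.0
    for _ in range(horizon):
        state = step(state, teacher(target - state))
        cost += abs(target - state)
    return cost


def select_teacher(teachers: Sequence[Teacher], state: float, target: float) -> int:
    """Index of the teacher with the lowest rollout cost from the current state."""
    costs = [rollout_cost(t, state, target) for t in teachers]
    return min(range(len(teachers)), key=costs.__getitem__)


def collect_demonstrations(
    teachers: Sequence[Teacher], start: float, target: float, steps: int = 30
) -> List[Tuple[float, float]]:
    """At every step, observe all teachers and record the best teacher's action."""
    data: List[Tuple[float, float]] = []
    state = start
    for _ in range(steps):
        best = teachers[select_teacher(teachers, state, target)]
        error = target - state
        action = best(error)
        data.append((error, action))
        state = step(state, action)
    return data


def fit_linear_policy(data: Sequence[Tuple[float, float]]) -> float:
    """Least-squares gain k for the student policy action = k * error."""
    numerator = sum(e * a for e, a in data)
    denominator = sum(e * e for e, _ in data)
    return numerator / denominator if denominator else 0.0


if __name__ == "__main__":
    # Three imperfect teachers: sluggish, reasonable, and unstable (gains are made up).
    teachers = [make_p_controller(g) for g in (0.5, 5.0, 25.0)]
    demos = collect_demonstrations(teachers, start=0.0, target=1.0)
    print("student gain:", fit_linear_policy(demos))
```

In this toy setting the student only ever imitates actions from the teacher that currently looks best, so a single bad teacher does not contaminate the training data. A real system would replace the hand-written dynamics with a simulator and the linear fit with a learned network.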

