DoShiCo challenge: Domain shift in control prediction

2018 IEEE International Conference on Simulation, Modeling, and Programming for Autonomous Robots (SIMPAR). Pub Date: 2017-10-26. DOI: 10.1109/SIMPAR.2018.8376264

Klaas Kelchtermans, T. Tuytelaars


Citations: 1

Abstract

Training deep neural network policies end-to-end for real-world applications has so far required either large demonstration datasets collected in the real world or large sets consisting of a large variety of realistic and closely related 3D CAD models. These real or virtual data should, moreover, closely match the conditions expected at test time. These stringent requirements, and the time-consuming data collection processes they entail, are currently the most important impediment keeping deep reinforcement learning from being deployed in real-world applications. Therefore, in this work we advocate an alternative approach: instead of avoiding any domain shift by carefully selecting the training data, the goal is to learn a policy that can cope with it. To this end, we propose the DoShiCo challenge: to train a model in very basic synthetic environments, far from realistic, in such a way that it can be applied in more realistic environments and can also make control decisions on real-world data. In particular, we focus on the task of collision avoidance for drones. We created a set of simulated environments that can be used as a benchmark and implemented a baseline method that exploits depth prediction as an auxiliary task to help overcome the domain shift. Even though the policy is trained in very basic environments, it can learn to fly without collisions in a very different, realistic simulated environment. Several benchmarks for reinforcement learning already exist, but they never include a large domain shift. On the other hand, several benchmarks in computer vision focus on domain shift, but they take the form of static datasets instead of simulated environments. In this work we claim that it is crucial to combine the two challenges in one benchmark.
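
The abstract names depth prediction as an auxiliary task but does not give the loss, the weighting, or the network architecture. As a rough illustration only, an auxiliary-task objective is commonly written as a weighted sum of the main loss and the auxiliary loss. The sketch below assumes mean squared error for both terms and a hypothetical `depth_weight` parameter; neither choice is taken from the paper.

```python
from typing import Sequence


def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error between two equal-length, non-empty sequences."""
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("pred and target must be non-empty and of equal length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def multitask_loss(
    control_pred: Sequence[float],
    control_target: Sequence[float],
    depth_pred: Sequence[float],
    depth_target: Sequence[float],
    depth_weight: float = 0.5,  # hypothetical weighting, not from the paper
) -> float:
    """Control loss plus a weighted auxiliary depth loss (illustrative only)."""
    control_loss = mse(control_pred, control_target)
    depth_loss = mse(depth_pred, depth_target)
    return control_loss + depth_weight * depth_loss


# Toy values: one control command and a four-pixel "depth map".
loss = multitask_loss(
    control_pred=[0.2],
    control_target=[0.0],
    depth_pred=[1.0, 2.0, 3.0, 4.0],
    depth_target=[1.0, 2.0, 3.0, 5.0],
)
print(loss)
```

In this kind of setup the depth term only shapes the shared features during training; at test time only the control output would be used.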

