Roll-Drop: accounting for observation noise with a single parameter

Conference on Learning for Dynamics & Control. Pub Date: 2023-04-25. DOI: 10.48550/arXiv.2304.13150

Luigi Campanaro, D. Martini, Siddhant Gangapurwala, W. Merkt, I. Havoutis


Abstract

This paper proposes Roll-Drop, a simple strategy for sim-to-real transfer in deep reinforcement learning (DRL) that uses dropout during simulation to account for observation noise during deployment, without explicitly modelling the noise distribution for each state. DRL is a promising approach to controlling robots in highly dynamic, feedback-based manoeuvres, and accurate simulators are crucial for providing cheap and abundant data from which to learn the desired behaviour. Nevertheless, simulated data are noiseless and generally exhibit a distributional shift that makes deployment difficult on real machines, where sensor readings are affected by noise. The standard solution is to model this noise and inject it during training. That approach requires thorough system identification, whereas Roll-Drop improves robustness to sensor noise by tuning only a single parameter. We demonstrate an 80% success rate when up to 25% noise is injected into the observations, twice the robustness of the baselines. We deploy the controller trained in simulation on a Unitree A1 platform and assess this improved robustness on the physical system.
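The idea of applying dropout while collecting simulated rollouts can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the tiny two-layer policy, the choice to apply dropout to the observation vector entering the policy, and the names `dropout`, `policy_action`, and `drop_prob`. The abstract does not say where the dropout is placed in the network or what value of the parameter the authors use.

```python
import numpy as np


def dropout(x, drop_prob, rng):
    """Inverted dropout: zero each element with probability drop_prob
    and rescale the survivors so the expected value is unchanged."""
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    if drop_prob == 0.0:
        return x
    keep = rng.random(x.shape) >= drop_prob
    return x * keep / (1.0 - drop_prob)


def policy_action(obs, weights, drop_prob, rng, training):
    """Hypothetical two-layer MLP policy.

    Dropout is active only while collecting simulated rollouts
    (training=True); at deployment the observation passes through as-is.
    """
    w1, w2 = weights
    x = dropout(obs, drop_prob, rng) if training else obs
    hidden = np.tanh(x @ w1)
    return np.tanh(hidden @ w2)


# Illustrative sizes and values only; none of these come from the paper.
rng = np.random.default_rng(0)
obs_dim, hidden_dim, act_dim = 8, 16, 4
weights = (
    rng.normal(size=(obs_dim, hidden_dim)) * 0.1,
    rng.normal(size=(hidden_dim, act_dim)) * 0.1,
)
obs = rng.normal(size=obs_dim)

a_train = policy_action(obs, weights, drop_prob=0.1, rng=rng, training=True)
a_deploy = policy_action(obs, weights, drop_prob=0.1, rng=rng, training=False)
```

In this sketch `drop_prob` plays the role of the single tunable parameter: it replaces a per-sensor noise model with one scalar that controls how much of the observation is randomly masked during simulated rollouts. It is a dropout rate, not the noise level quoted in the results.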
