USN: A Robust Imitation Learning Method against Diverse Action Noise

ACS Applied Energy Materials | Pub Date: 2024-04-21 | DOI: 10.1613/jair.1.15819

Xingrui Yu, Bo Han, I. Tsang

Abstract

Learning from imperfect demonstrations is a crucial challenge in imitation learning (IL). Unlike existing works that still rely on the enormous effort of expert demonstrators, we consider a more cost-effective way to obtain a large number of demonstrations: hiring annotators to label actions for existing image records in realistic scenarios. However, action noise can occur when annotators are not domain experts or encounter confusing states. In this work, we introduce two particular forms of action noise: state-independent and state-dependent. Previous IL methods fail to achieve expert-level performance when the demonstrations contain action noise, especially state-dependent action noise. To mitigate the harmful effects of action noise, we propose a robust learning paradigm called USN (Uncertainty-aware Sample-selection with Negative learning). The model first estimates the predictive uncertainty for all demonstration data and then selects samples with high loss based on the uncertainty measures. Finally, it updates the model parameters with additional negative learning on the selected samples. Empirical results on Box2D tasks and Atari games show that USN consistently improves the final rewards of behavioral cloning, online imitation learning, and offline imitation learning methods under various kinds of action noise. The ratio of significant improvements is up to 94.44%. Moreover, our method scales to conditional imitation learning with real-world noisy commands in urban driving.
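
The three steps named in the abstract (estimate uncertainty, select high-loss samples, add negative learning) can be illustrated with a small sketch. This is not the authors' implementation; the abstract does not give the details, so everything below is an assumption made for illustration: discrete actions, uncertainty estimated by averaging softmax outputs over several stochastic forward passes, selection of a fixed top fraction by cross-entropy loss, and a negative-learning term of the form -log(1 - p) applied to the annotated action of the selected samples. The names `mc_predictive`, `usn_style_loss`, `select_ratio`, and `neg_weight` are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one list of action logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]


def mc_predictive(mc_logits):
    """Average action probabilities over stochastic passes (assumed setup).

    mc_logits[t][i] holds the action logits for sample i on pass t.
    Returns the mean distribution and its entropy for every sample.
    """
    num_passes, num_samples = len(mc_logits), len(mc_logits[0])
    mean_probs, entropies = [], []
    for i in range(num_samples):
        probs = [softmax(mc_logits[t][i]) for t in range(num_passes)]
        mean = [sum(p[a] for p in probs) / num_passes
                for a in range(len(probs[0]))]
        mean_probs.append(mean)
        entropies.append(-sum(p * math.log(p + 1e-12) for p in mean))
    return mean_probs, entropies


def usn_style_loss(mc_logits, actions, select_ratio=0.2, neg_weight=1.0):
    """Cross-entropy on all samples plus negative learning on selected ones.

    Assumption: samples are ranked by cross-entropy under the averaged
    prediction, and the top `select_ratio` fraction is treated as suspect.
    """
    mean_probs, entropies = mc_predictive(mc_logits)
    n = len(actions)
    # Probability the averaged model assigns to each annotated action.
    p_label = [mean_probs[i][actions[i]] for i in range(n)]
    ce = [-math.log(p + 1e-12) for p in p_label]
    # Select the k samples with the highest loss.
    k = max(1, round(select_ratio * n))
    selected = sorted(range(n), key=lambda i: ce[i], reverse=True)[:k]
    # Negative learning: push probability away from the suspect label.
    neg = [-math.log(1.0 - p_label[i] + 1e-12) for i in selected]
    total = sum(ce) / n + neg_weight * sum(neg) / k
    return total, selected, entropies
```

In a training loop, `total` would be the quantity minimized by gradient descent, so an autodiff framework would replace the plain-Python arithmetic; the selection rule and the weighting between the two terms are design choices that this sketch fixes arbitrarily.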
