SAIL: Simulation-Informed Active In-the-Wild Learning

Elaine Schaertl Short, Adam Allevato, A. Thomaz
Published in: 2019 14th ACM/IEEE International Conference on Human-Robot Interaction (HRI), pp. 468-477, 2019-03-01
DOI: 10.1109/HRI.2019.8673019 (https://doi.org/10.1109/HRI.2019.8673019)
Citations: 6

Abstract

Robots in real-world environments may need to adapt context-specific behaviors learned in one environment to new environments with new constraints. In many cases, co-present humans can provide the robot with information, but it may not be safe for them to provide hands-on demonstrations, and there may not be a dedicated supervisor to provide constant feedback. In this work we present the SAIL (Simulation-Informed Active In-the-Wild Learning) algorithm for learning new approaches to manipulation skills starting from a single demonstration. In this three-step algorithm, the robot simulates task execution to choose new potential approaches; collects unsupervised data on task execution in the target environment; and finally, chooses informative actions to show to co-present humans and obtain labels. Our approach enables a robot to learn new ways of executing two different tasks by using success/failure labels obtained from naïve users in a public space, performing 496 manipulation actions and collecting 163 labels from users in the wild over six deployments of 45 minutes to 1 hour each. We show that classifiers based on low-level sensor data can be used to accurately distinguish between successful and unsuccessful motions in a multi-step task ($p < 0.005$), even when trained in the wild. We also show that using the sensor data to choose which actions to sample is more effective than choosing the least-sampled action.
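The final comparison in the abstract, choosing actions from sensor data versus choosing the least-sampled action, can be illustrated with a small sketch. The abstract does not state the selection criterion SAIL uses, so this example substitutes a standard uncertainty-sampling rule (maximum binary entropy of a classifier's predicted success probability). The function names, action names, and numbers are all hypothetical and are not taken from the paper.

```python
import math


def least_sampled(label_counts):
    """Baseline: pick the action that has received the fewest labels so far."""
    return min(label_counts, key=label_counts.get)


def binary_entropy(p):
    """Entropy (in bits) of a success/failure prediction with P(success) = p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def most_uncertain(success_probs):
    """Assumed stand-in for a sensor-informed choice: pick the action whose
    predicted outcome is least certain (highest entropy)."""
    return max(success_probs, key=lambda a: binary_entropy(success_probs[a]))


# Hypothetical state: labels collected per candidate approach, and a
# classifier's predicted success probability computed from sensor data.
label_counts = {"approach_a": 5, "approach_b": 1, "approach_c": 3}
success_probs = {"approach_a": 0.55, "approach_b": 0.95, "approach_c": 0.10}

baseline_choice = least_sampled(label_counts)    # fewest labels so far
informed_choice = most_uncertain(success_probs)  # prediction nearest 0.5
```

With these made-up numbers the two rules disagree: the baseline spends a label on an action the classifier is already confident about, while the uncertainty rule asks about the action whose outcome is hardest to predict. This is one plausible reading of why a sensor-informed choice could be more label-efficient, not a description of the authors' implementation.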