Risk-Calibrated Human-Robot Interaction via Set-Valued Intent Prediction

arXiv - CS - Systems and Control. Pub Date: 2024-03-23. DOI: arxiv-2403.15959

Justin Lidard, Hang Pham, Ariel Bachman, Bryan Boateng, Anirudha Majumdar


Citation count: 0

Abstract

Tasks where robots must cooperate with humans, such as navigating around a cluttered home or sorting everyday items, are challenging because they exhibit a wide range of valid actions that lead to similar outcomes. Moreover, zero-shot cooperation between human-robot partners is an especially challenging problem because it requires the robot to infer and adapt on the fly to a latent human intent, which could vary significantly from human to human. Recently, deep learned motion prediction models have shown promising results in predicting human intent but are prone to being confidently incorrect. In this work, we present Risk-Calibrated Interactive Planning (RCIP), which is a framework for measuring and calibrating risk associated with uncertain action selection in human-robot cooperation, with the fundamental idea that the robot should ask for human clarification when the risk associated with the uncertainty in the human's intent cannot be controlled. RCIP builds on the theory of set-valued risk calibration to provide a finite-sample statistical guarantee on the cumulative loss incurred by the robot while minimizing the cost of human clarification in complex multi-step settings. Our main insight is to frame the risk control problem as a sequence-level multi-hypothesis testing problem, allowing efficient calibration using a low-dimensional parameter that controls a pre-trained risk-aware policy. Experiments across a variety of simulated and real-world environments demonstrate RCIP's ability to predict and adapt to a diverse set of dynamic human intents.
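As a rough illustration of the calibration idea described in the abstract, the sketch below picks a single threshold parameter so that a set-valued intent predictor meets a target risk level while asking for help as rarely as possible. It is not the authors' implementation. The single-step setting, the probability-threshold prediction set, the miscoverage loss, the Hoeffding p-value with a Bonferroni correction, and all function names (`prediction_set`, `hoeffding_p_value`, `calibrate`, `act_or_ask`) are assumptions made for illustration.

```python
import math


def prediction_set(probs, lam):
    """Return the intents whose predicted probability is at least lam."""
    return [k for k, p in enumerate(probs) if p >= lam]


def hoeffding_p_value(risk_hat, n, alpha):
    """P-value for H0: true risk > alpha, assuming losses bounded in [0, 1]."""
    gap = max(alpha - risk_hat, 0.0)
    return math.exp(-2.0 * n * gap * gap)


def calibrate(cal_probs, cal_intents, grid, alpha, delta):
    """Pick the threshold with the lowest help rate among those certified safe.

    Returns (lam, help_rate, empirical_risk), or None if no threshold in the
    grid can be certified at risk level alpha with confidence 1 - delta.
    """
    n = len(cal_probs)
    best = None
    for lam in grid:
        sets = [prediction_set(p, lam) for p in cal_probs]
        # Loss (assumed): 1 if the true intent is missing from the set.
        risk_hat = sum(y not in s for s, y in zip(sets, cal_intents)) / n
        # The robot asks for help unless exactly one intent remains.
        help_rate = sum(len(s) != 1 for s in sets) / n
        # Bonferroni correction over the grid of candidate thresholds.
        if hoeffding_p_value(risk_hat, n, alpha) <= delta / len(grid):
            if best is None or help_rate < best[1]:
                best = (lam, help_rate, risk_hat)
    return best


def act_or_ask(probs, lam):
    """Act on a singleton prediction set; otherwise request clarification."""
    intents = prediction_set(probs, lam)
    if len(intents) == 1:
        return ("act", intents[0])
    return ("ask", intents)
```

Each candidate threshold is treated as one hypothesis test, so the whole grid is a multi-hypothesis test over a one-dimensional parameter. The abstract describes doing this at the sequence level for multi-step tasks, which this single-step sketch does not attempt.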


