Learning Model Predictive Controllers with Real-Time Attention for Real-World Navigation

Conference on Robot Learning | Pub Date: 2022-09-22 | DOI: 10.48550/arXiv.2209.10780

Xuesu Xiao, Tingnan Zhang, K. Choromanski, Edward Lee, Anthony Francis, Jacob Varley, Stephen Tu, Sumeet Singh, Peng Xu, Fei Xia, S. M. Persson, Dmitry Kalashnikov, L. Takayama, Roy Frostig, Jie Tan, Carolina Parada, Vikas Sindhwani


Citations: 26

Abstract

Despite decades of research, existing navigation systems still face real-world challenges when deployed in the wild, e.g., in cluttered home environments or in human-occupied public spaces. To address this, we present a new class of implicit control policies that combines the benefits of imitation learning with the robust handling of system constraints provided by Model Predictive Control (MPC). Our approach, called Performer-MPC, uses a learned cost function parameterized by vision context embeddings provided by Performers, a low-rank implicit-attention Transformer. We jointly train the cost function and construct the controller that relies on it, effectively solving the corresponding bi-level optimization problem end-to-end. We show that the resulting policy improves on standard MPC performance by leveraging a few expert demonstrations of the desired navigation behavior in different challenging real-world scenarios. Compared with a standard MPC policy, Performer-MPC achieves >40% better goal-reached performance in cluttered environments and >65% better performance on social metrics when navigating around humans.
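The bi-level problem mentioned in the abstract can be written out in a generic form. The following is only a sketch under stated assumptions, not the formulation from the paper: the symbols $\theta$ (learnable parameters), $o_i$ (visual observation), $\phi_\theta$ (Performer embedding), $C$ (learned cost), $\tau$ (planned trajectory), $\mathcal{T}(x_i)$ (trajectories satisfying dynamics and constraints from state $x_i$), $\tau_i^{\mathrm{exp}}$ (expert demonstration), and $\ell$ (imitation loss) are all introduced here for illustration.

```latex
% Generic bi-level imitation objective (illustrative notation, not from the paper)
\begin{aligned}
\min_{\theta} \quad & \sum_{i=1}^{N} \ell\!\left(\tau_\theta^{*}(o_i, x_i),\; \tau_i^{\mathrm{exp}}\right)
  && \text{(outer level: match expert demonstrations)} \\
\text{s.t.} \quad & \tau_\theta^{*}(o_i, x_i) \in \arg\min_{\tau \in \mathcal{T}(x_i)} C\!\left(\tau;\, \phi_\theta(o_i)\right)
  && \text{(inner level: MPC under the learned cost)}
\end{aligned}
```

In this reading, training end-to-end means that gradients of the outer loss with respect to $\theta$ are propagated through the inner MPC solution, so the cost function is shaped by how well the resulting controller imitates the expert.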

