Mamba Policy: Towards Efficient 3D Diffusion Policy with Hybrid Selective State Models

Jiahang Cao, Qiang Zhang, Jingkai Sun, Jiaxu Wang, Hao Cheng, Yulin Li, Jun Ma, Yecheng Shao, Wen Zhao, Gang Han, Yijie Guo, Renjing Xu
{"title":"Mamba Policy: Towards Efficient 3D Diffusion Policy with Hybrid Selective State Models","authors":"Jiahang Cao, Qiang Zhang, Jingkai Sun, Jiaxu Wang, Hao Cheng, Yulin Li, Jun Ma, Yecheng Shao, Wen Zhao, Gang Han, Yijie Guo, Renjing Xu","doi":"arxiv-2409.07163","DOIUrl":null,"url":null,"abstract":"Diffusion models have been widely employed in the field of 3D manipulation\ndue to their efficient capability to learn distributions, allowing for precise\nprediction of action trajectories. However, diffusion models typically rely on\nlarge parameter UNet backbones as policy networks, which can be challenging to\ndeploy on resource-constrained devices. Recently, the Mamba model has emerged\nas a promising solution for efficient modeling, offering low computational\ncomplexity and strong performance in sequence modeling. In this work, we\npropose the Mamba Policy, a lighter but stronger policy that reduces the\nparameter count by over 80% compared to the original policy network while\nachieving superior performance. Specifically, we introduce the XMamba Block,\nwhich effectively integrates input information with conditional features and\nleverages a combination of Mamba and Attention mechanisms for deep feature\nextraction. Extensive experiments demonstrate that the Mamba Policy excels on\nthe Adroit, Dexart, and MetaWorld datasets, requiring significantly fewer\ncomputational resources. Additionally, we highlight the Mamba Policy's enhanced\nrobustness in long-horizon scenarios compared to baseline methods and explore\nthe performance of various Mamba variants within the Mamba Policy framework.\nOur project page is in https://andycao1125.github.io/mamba_policy/.","PeriodicalId":501031,"journal":{"name":"arXiv - CS - Robotics","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Robotics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.07163","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Diffusion models have been widely employed in the field of 3D manipulation due to their efficient capability to learn distributions, allowing for precise prediction of action trajectories. However, diffusion models typically rely on large-parameter UNet backbones as policy networks, which can be challenging to deploy on resource-constrained devices. Recently, the Mamba model has emerged as a promising solution for efficient modeling, offering low computational complexity and strong performance in sequence modeling. In this work, we propose the Mamba Policy, a lighter but stronger policy that reduces the parameter count by over 80% compared to the original policy network while achieving superior performance. Specifically, we introduce the XMamba Block, which effectively integrates input information with conditional features and leverages a combination of Mamba and Attention mechanisms for deep feature extraction. Extensive experiments demonstrate that the Mamba Policy excels on the Adroit, Dexart, and MetaWorld datasets while requiring significantly fewer computational resources. Additionally, we highlight the Mamba Policy's enhanced robustness in long-horizon scenarios compared to baseline methods and explore the performance of various Mamba variants within the Mamba Policy framework. Our project page is available at https://andycao1125.github.io/mamba_policy/.
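To make the described design concrete, below is a minimal PyTorch sketch of how an XMamba-style block might fuse a noisy action sequence with conditional features (observation and diffusion-timestep embeddings) and then apply SSM and attention mixing. This is an illustration based only on the abstract: the names (`XMambaBlock`, `StandInSSM`), the FiLM-style conditioning, and the stand-in sequence mixer are assumptions, not the authors' implementation; a real version would use a selective SSM layer (e.g., from the `mamba_ssm` package) in place of the placeholder.

```python
# Hedged sketch of an XMamba-style block for a diffusion policy head.
# Assumptions: class names, FiLM-style conditioning, and the stand-in SSM mixer
# are illustrative; the paper's actual Mamba layer and block layout may differ.
import torch
import torch.nn as nn
import torch.nn.functional as F


class StandInSSM(nn.Module):
    """Placeholder sequence mixer standing in for a Mamba selective SSM layer."""

    def __init__(self, dim: int, conv_kernel: int = 4):
        super().__init__()
        self.in_proj = nn.Linear(dim, 2 * dim)
        self.conv = nn.Conv1d(dim, dim, conv_kernel, padding=conv_kernel - 1, groups=dim)
        self.out_proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (B, T, D)
        u, gate = self.in_proj(x).chunk(2, dim=-1)
        # Causal depthwise conv over the time axis, cropped back to length T.
        u = self.conv(u.transpose(1, 2))[..., : x.shape[1]].transpose(1, 2)
        return self.out_proj(F.silu(u) * torch.sigmoid(gate))


class XMambaBlock(nn.Module):
    """Fuse noisy action tokens with conditioning, then mix with SSM + attention."""

    def __init__(self, dim: int, n_heads: int = 4):
        super().__init__()
        self.cond_proj = nn.Linear(dim, 2 * dim)  # FiLM-style scale/shift from condition
        self.norm1 = nn.LayerNorm(dim)
        self.ssm = StandInSSM(dim)                # a Mamba selective SSM layer would go here
        self.norm2 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)

    def forward(self, x: torch.Tensor, cond: torch.Tensor) -> torch.Tensor:
        # x: (B, T, D) noisy action tokens; cond: (B, D) observation + timestep embedding
        scale, shift = self.cond_proj(cond).unsqueeze(1).chunk(2, dim=-1)
        h = self.norm1(x) * (1 + scale) + shift   # inject conditional features
        h = x + self.ssm(h)                       # SSM mixing with residual connection
        a, _ = self.attn(self.norm2(h), self.norm2(h), self.norm2(h))
        return h + a                              # attention refinement with residual


if __name__ == "__main__":
    block = XMambaBlock(dim=128)
    actions = torch.randn(2, 16, 128)    # (batch, prediction horizon, feature dim)
    condition = torch.randn(2, 128)
    print(block(actions, condition).shape)  # torch.Size([2, 16, 128])
```

In a diffusion policy, a stack of such blocks would replace the UNet denoiser: the block predicts the noise (or denoised action) at each diffusion step, conditioned on the observation features and timestep embedding.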