Mamba Policy: Towards Efficient 3D Diffusion Policy with Hybrid Selective State Models

Jiahang Cao, Qiang Zhang, Jingkai Sun, Jiaxu Wang, Hao Cheng, Yulin Li, Jun Ma, Yecheng Shao, Wen Zhao, Gang Han, Yijie Guo, Renjing Xu
{"title":"Mamba Policy: Towards Efficient 3D Diffusion Policy with Hybrid Selective State Models","authors":"Jiahang Cao, Qiang Zhang, Jingkai Sun, Jiaxu Wang, Hao Cheng, Yulin Li, Jun Ma, Yecheng Shao, Wen Zhao, Gang Han, Yijie Guo, Renjing Xu","doi":"arxiv-2409.07163","DOIUrl":null,"url":null,"abstract":"Diffusion models have been widely employed in the field of 3D manipulation\ndue to their efficient capability to learn distributions, allowing for precise\nprediction of action trajectories. However, diffusion models typically rely on\nlarge parameter UNet backbones as policy networks, which can be challenging to\ndeploy on resource-constrained devices. Recently, the Mamba model has emerged\nas a promising solution for efficient modeling, offering low computational\ncomplexity and strong performance in sequence modeling. In this work, we\npropose the Mamba Policy, a lighter but stronger policy that reduces the\nparameter count by over 80% compared to the original policy network while\nachieving superior performance. Specifically, we introduce the XMamba Block,\nwhich effectively integrates input information with conditional features and\nleverages a combination of Mamba and Attention mechanisms for deep feature\nextraction. Extensive experiments demonstrate that the Mamba Policy excels on\nthe Adroit, Dexart, and MetaWorld datasets, requiring significantly fewer\ncomputational resources. Additionally, we highlight the Mamba Policy's enhanced\nrobustness in long-horizon scenarios compared to baseline methods and explore\nthe performance of various Mamba variants within the Mamba Policy framework.\nOur project page is in https://andycao1125.github.io/mamba_policy/.","PeriodicalId":501031,"journal":{"name":"arXiv - CS - Robotics","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Robotics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.07163","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Diffusion models have been widely employed in the field of 3D manipulation due to their efficient capability to learn distributions, allowing for precise prediction of action trajectories. However, diffusion models typically rely on large-parameter UNet backbones as policy networks, which can be challenging to deploy on resource-constrained devices. Recently, the Mamba model has emerged as a promising solution for efficient modeling, offering low computational complexity and strong performance in sequence modeling. In this work, we propose the Mamba Policy, a lighter but stronger policy that reduces the parameter count by over 80% compared to the original policy network while achieving superior performance. Specifically, we introduce the XMamba Block, which effectively integrates input information with conditional features and leverages a combination of Mamba and Attention mechanisms for deep feature extraction. Extensive experiments demonstrate that the Mamba Policy excels on the Adroit, Dexart, and MetaWorld datasets while requiring significantly fewer computational resources. Additionally, we highlight the Mamba Policy's enhanced robustness in long-horizon scenarios compared to baseline methods and explore the performance of various Mamba variants within the Mamba Policy framework. Our project page is available at https://andycao1125.github.io/mamba_policy/.
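To make the described design concrete, below is a minimal PyTorch sketch of how an XMamba-style block might fuse a noisy action sequence with conditional features (observation and diffusion-timestep embeddings) and then apply SSM and attention mixing. This is an illustration based only on the abstract: the names (`XMambaBlock`, `StandInSSM`), the FiLM-style conditioning, and the stand-in sequence mixer are assumptions, not the authors' implementation; a real version would use a selective SSM layer (e.g., from the `mamba_ssm` package) in place of the placeholder.

```python
# Hedged sketch of an XMamba-style block for a diffusion policy head.
# Assumptions: class names, FiLM-style conditioning, and the stand-in SSM mixer
# are illustrative; the paper's actual Mamba layer and block layout may differ.
import torch
import torch.nn as nn
import torch.nn.functional as F


class StandInSSM(nn.Module):
    """Placeholder sequence mixer standing in for a Mamba selective SSM layer."""

    def __init__(self, dim: int, conv_kernel: int = 4):
        super().__init__()
        self.in_proj = nn.Linear(dim, 2 * dim)
        self.conv = nn.Conv1d(dim, dim, conv_kernel, padding=conv_kernel - 1, groups=dim)
        self.out_proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (B, T, D)
        u, gate = self.in_proj(x).chunk(2, dim=-1)
        # Causal depthwise conv over the time axis, cropped back to length T.
        u = self.conv(u.transpose(1, 2))[..., : x.shape[1]].transpose(1, 2)
        return self.out_proj(F.silu(u) * torch.sigmoid(gate))


class XMambaBlock(nn.Module):
    """Fuse noisy action tokens with conditioning, then mix with SSM + attention."""

    def __init__(self, dim: int, n_heads: int = 4):
        super().__init__()
        self.cond_proj = nn.Linear(dim, 2 * dim)  # FiLM-style scale/shift from condition
        self.norm1 = nn.LayerNorm(dim)
        self.ssm = StandInSSM(dim)                # a Mamba selective SSM layer would go here
        self.norm2 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)

    def forward(self, x: torch.Tensor, cond: torch.Tensor) -> torch.Tensor:
        # x: (B, T, D) noisy action tokens; cond: (B, D) observation + timestep embedding
        scale, shift = self.cond_proj(cond).unsqueeze(1).chunk(2, dim=-1)
        h = self.norm1(x) * (1 + scale) + shift   # inject conditional features
        h = x + self.ssm(h)                       # SSM mixing with residual connection
        a, _ = self.attn(self.norm2(h), self.norm2(h), self.norm2(h))
        return h + a                              # attention refinement with residual


if __name__ == "__main__":
    block = XMambaBlock(dim=128)
    actions = torch.randn(2, 16, 128)    # (batch, prediction horizon, feature dim)
    condition = torch.randn(2, 128)
    print(block(actions, condition).shape)  # torch.Size([2, 16, 128])
```

In a diffusion policy, a stack of such blocks would replace the UNet denoiser: the block predicts the noise (or denoised action) at each diffusion step, conditioned on the observation features and timestep embedding.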