Language-driven closed-loop grasping with model-predictive trajectory optimization

IF 3.1; Zone 3 (Computer Science); JCR Q2 (Automation & Control Systems)
H.H. Nguyen, M.N. Vu, F. Beck, G. Ebmer, A. Nguyen, W. Kemmetmueller, A. Kugi
Mechatronics, vol. 109, Article 103335. Journal Article, published 2025-05-12.
DOI: 10.1016/j.mechatronics.2025.103335
Source: https://www.sciencedirect.com/science/article/pii/S0957415825000443
Citations: 0

Abstract

Language-driven closed-loop grasping with model-predictive trajectory optimization
Integrating a vision module into a closed-loop control system so that a robot moves seamlessly during a manipulation task is challenging because the modules involved run at inconsistent update rates. The task is even harder in a dynamic environment, e.g., when objects are moving. This paper presents a modular zero-shot framework for language-driven manipulation of (dynamic) objects through a closed-loop control system with real-time trajectory replanning and online 6D object pose localization. We segment an object within 0.5 s by leveraging a vision language model via language commands. Then, guided by natural language commands, a closed-loop system that combines unified pose estimation and tracking with online trajectory planning continuously tracks this object and computes the optimal trajectory in real time. The proposed zero-shot framework provides a smooth trajectory that avoids jerky movements and ensures the robot can grasp a non-stationary object. Experimental results demonstrate the real-time capability of the proposed zero-shot modular framework to accurately and efficiently grasp moving objects. The framework achieves update rates of up to 30 Hz for the online 6D pose localization module and 10 Hz for the receding-horizon trajectory optimization. These advantages highlight the modular framework’s potential applications in robotics and human–robot interaction; see the video at language-driven-grasping.github.io.
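The rate mismatch mentioned in the abstract (pose updates at up to 30 Hz, replanning at 10 Hz) can be pictured with a small simulation. The following is a minimal sketch under strong assumptions, not the authors' implementation: the problem is reduced to one dimension, the model-predictive optimization is swapped for a closed-form quintic polynomial that is recomputed at every planning tick, and all names (`quintic_coeffs`, `evaluate`, `object_position`, `simulate`) and the target motion are hypothetical.

```python
TRACK_HZ = 30   # pose-localization rate quoted in the abstract
PLAN_HZ = 10    # trajectory-optimization rate quoted in the abstract


def quintic_coeffs(p0, v0, a0, pT, vT, T):
    """Quintic from state (p0, v0, a0) to (pT, vT, zero acceleration) in time T."""
    c0, c1, c2 = p0, v0, 0.5 * a0
    dp = pT - (c0 + c1 * T + c2 * T**2)   # position still to be covered
    dv = vT - (c1 + 2 * c2 * T)           # velocity still to be gained
    da = -2 * c2                          # acceleration must return to zero
    c3 = (10 * dp - 4 * dv * T + 0.5 * da * T**2) / T**3
    c4 = (-15 * dp + 7 * dv * T - da * T**2) / T**4
    c5 = (6 * dp - 3 * dv * T + 0.5 * da * T**2) / T**5
    return c0, c1, c2, c3, c4, c5


def evaluate(c, t):
    """Position, velocity and acceleration of polynomial c at time t."""
    p = sum(ck * t**k for k, ck in enumerate(c))
    v = sum(k * ck * t**(k - 1) for k, ck in enumerate(c) if k >= 1)
    a = sum(k * (k - 1) * ck * t**(k - 2) for k, ck in enumerate(c) if k >= 2)
    return p, v, a


def object_position(t):
    """Hypothetical 1-D target that speeds up halfway through the run."""
    return 0.5 + 0.10 * t if t < 1.0 else 0.6 + 0.25 * (t - 1.0)


def simulate(grasp_time=2.0):
    dt = 1.0 / TRACK_HZ
    steps = round(grasp_time * TRACK_HZ)
    replan_every = TRACK_HZ // PLAN_HZ      # 3 tracker ticks per plan
    state = (0.0, 0.0, 0.0)                 # gripper position, velocity, acceleration
    prev_meas, vel_est = None, 0.0
    coeffs, plan_step = None, 0
    for k in range(steps):
        t = k * dt
        meas = object_position(t)           # pose update on every tick (30 Hz)
        if prev_meas is not None:
            vel_est = (meas - prev_meas) / dt
        prev_meas = meas
        if k % replan_every == 0:           # replanning on every third tick (10 Hz)
            remaining = grasp_time - t
            target = meas + vel_est * remaining   # predicted object position at grasp time
            coeffs = quintic_coeffs(*state, target, vel_est, remaining)
            plan_step = k
        # Follow the most recent plan until the next replanning tick.
        state = evaluate(coeffs, (k + 1 - plan_step) * dt)
    return state[0], object_position(grasp_time)


if __name__ == "__main__":
    gripper, obj = simulate()
    print(f"gripper={gripper:.4f} object={obj:.4f}")
```

Because each new plan starts from the gripper's current position, velocity, and acceleration, the executed motion stays continuous across replanning ticks even when the target changes speed. That continuity is the property the abstract describes as avoiding jerky movements, here obtained by construction rather than by optimization.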
Source journal: Mechatronics (Engineering Technology; Engineering: Electrical & Electronic)
CiteScore: 5.90
Self-citation rate: 9.10%
Articles published: 0
Review time: 109 days
Journal description: Mechatronics is the synergistic combination of precision mechanical engineering, electronic control and systems thinking in the design of products and manufacturing processes. It relates to the design of systems, devices and products aimed at achieving an optimal balance between basic mechanical structure and its overall control. The purpose of this journal is to provide rapid publication of topical papers featuring practical developments in mechatronics. It will cover a wide range of application areas including consumer product design, instrumentation, manufacturing methods, computer integration and process and device control, and will attract a readership from across the industrial and academic research spectrum. Particular importance will be attached to aspects of innovation in mechatronics design philosophy which illustrate the benefits obtainable by an a priori integration of functionality with embedded microprocessor control. A major item will be the design of machines, devices and systems possessing a degree of computer-based intelligence. The journal seeks to publish research progress in this field with an emphasis on the applied rather than the theoretical. It will also serve the dual role of bringing greater recognition to this important area of engineering.