Vision-Language Model-Based Human-Guided Mobile Robot Navigation in an Unstructured Environment for Human-Centric Smart Manufacturing

IF 10.1, Zone 1 (Engineering Technology), JCR Q1 (Engineering, Multidisciplinary)
Tian Wang, Junming Fan, Pai Zheng, Ruqiang Yan, Lihui Wang
Journal: Engineering. Published: 2025-07-15. DOI: 10.1016/j.eng.2025.04.028 (https://doi.org/10.1016/j.eng.2025.04.028)
Citations: 0

Abstract

In smart manufacturing, autonomous mobile robots play an indispensable role in conducting inspection and material handling operations, yet they face significant limitations regarding adaptability and resilience within unstructured environments. Vision and language navigation (VLN), a human-guided navigation paradigm, emerges as a compelling solution to these challenges. Nevertheless, VLN’s practical implementation is constrained by limited task generalization capabilities, inadequate response to diverse linguistic commands, and insufficient consideration of sensor-induced noise in environmental perception. This research addresses these limitations by introducing an innovative vision-language model (VLM)-based human-guided mobile robot navigation approach in an unstructured environment for human-centric smart manufacturing (HSM). This approach encompasses robust three-dimensional (3D) scene reconstruction through advanced point cloud techniques, zero-shot semantic segmentation via a VLM, and natural language processing through a large language model (LLM) to interpret instructions and generate control code for navigation. The system’s efficacy is validated through extensive experiments in an unstructured manufacturing setup.
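The abstract does not say which point cloud techniques are used for reconstruction. Purely as an illustration of how sensor-induced noise might be suppressed before a scene is reconstructed, the minimal sketch below applies a statistical outlier filter: each point is scored by its mean distance to its nearest neighbours, and points scoring far above the average are dropped. The function names, the neighbour count `k`, and the `std_ratio` threshold are assumptions made for this example and are not taken from the paper.

```python
import math
import statistics
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def mean_knn_distance(points: Sequence[Point], index: int, k: int) -> float:
    """Mean Euclidean distance from points[index] to its k nearest neighbours."""
    p = points[index]
    dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != index)
    return sum(dists[:k]) / k


def remove_statistical_outliers(
    points: Sequence[Point], k: int = 3, std_ratio: float = 1.0
) -> List[Point]:
    """Keep points whose mean k-NN distance is at most mean + std_ratio * std.

    Hypothetical helper for illustration; not the paper's method.
    """
    if len(points) <= k:
        # Too few points to estimate neighbourhood statistics; keep them all.
        return list(points)
    scores = [mean_knn_distance(points, i, k) for i in range(len(points))]
    threshold = statistics.mean(scores) + std_ratio * statistics.pstdev(scores)
    return [p for p, s in zip(points, scores) if s <= threshold]


# A flat 3x3 patch of floor points plus one stray reading far above it.
floor = [(float(x), float(y), 0.0) for x in range(3) for y in range(3)]
noisy_cloud = floor + [(1.0, 1.0, 10.0)]
cleaned = remove_statistical_outliers(noisy_cloud, k=3, std_ratio=1.0)
```

This brute-force version compares every pair of points, so it only suits small clouds. A real system would more likely rely on a spatial index or a point cloud library.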
Source journal: Engineering (Environmental Science - Environmental Engineering)
Self-citation rate: 1.60%
Articles published: 335
Review time: 35 days
Journal description: Engineering, an international open-access journal initiated by the Chinese Academy of Engineering (CAE) in 2015, serves as a distinguished platform for disseminating cutting-edge advancements in engineering R&D, sharing major research outputs, and highlighting key achievements worldwide. The journal's objectives encompass reporting progress in engineering science, fostering discussions on hot topics, addressing areas of interest, challenges, and prospects in engineering development, while considering human and environmental well-being and ethics in engineering. It aims to inspire breakthroughs and innovations with profound economic and social significance, propelling them to advanced international standards and transforming them into a new productive force. Ultimately, this endeavor seeks to bring about positive changes globally, benefit humanity, and shape a new future.