Tian Wang, Junming Fan, Pai Zheng, Ruqiang Yan, Lihui Wang
Engineering. Published 2025-07-15. DOI: https://doi.org/10.1016/j.eng.2025.04.028
Vision-Language Model-Based Human-Guided Mobile Robot Navigation in an Unstructured Environment for Human-Centric Smart Manufacturing
In smart manufacturing, autonomous mobile robots play an indispensable role in conducting inspection and material handling operations, yet they face significant limitations regarding adaptability and resilience within unstructured environments. Vision and language navigation (VLN), a human-guided navigation paradigm, emerges as a compelling solution to these challenges. Nevertheless, VLN’s practical implementation is constrained by limited task generalization capabilities, inadequate response to diverse linguistic commands, and insufficient consideration of sensor-induced noise in environmental perception. This research addresses these limitations by introducing an innovative vision-language model (VLM)-based human-guided mobile robot navigation approach in an unstructured environment for human-centric smart manufacturing (HSM). This approach encompasses robust three-dimensional (3D) scene reconstruction through advanced point cloud techniques, zero-shot semantic segmentation via a VLM, and natural language processing through a large language model (LLM) to interpret instructions and generate control code for navigation. The system’s efficacy is validated through extensive experiments in an unstructured manufacturing setup.
About the journal:
Engineering, an international open-access journal initiated by the Chinese Academy of Engineering (CAE) in 2015, serves as a distinguished platform for disseminating cutting-edge advancements in engineering R&D, sharing major research outputs, and highlighting key achievements worldwide. The journal's objectives encompass reporting progress in engineering science, fostering discussions on hot topics, addressing areas of interest, challenges, and prospects in engineering development, while considering human and environmental well-being and ethics in engineering. It aims to inspire breakthroughs and innovations with profound economic and social significance, propelling them to advanced international standards and transforming them into a new productive force. Ultimately, this endeavor seeks to bring about positive changes globally, benefit humanity, and shape a new future.