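Empowering natural human–robot collaboration through multimodal language models and spatial intelligence: Pathways and perspectives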

Impact factor: 11.4 | Zone 1 (Computer Science) | JCR Q1 (Computer Science, Interdisciplinary Applications)
Authors: Duidi Wu, Pai Zheng, Qianyou Zhao, Shuo Zhang, Jin Qi, Jie Hu, Guo-Niu Zhu, Lihui Wang
DOI: 10.1016/j.rcim.2025.103064
Journal: Robotics and Computer-Integrated Manufacturing, Volume 97, Article 103064
Publication date: 2025-06-11
Publication type: Journal Article
Open access: No
Source: https://www.sciencedirect.com/science/article/pii/S0736584525001188
Citations: 0

Abstract

Industry 5.0 advocates human-centric smart manufacturing (HSM), with growing attention to proactive human–robot collaboration (HRC). Meanwhile, the rapid development of multimodal large language models (MLLMs) and embodied intelligence is driving an unprecedented evolution. This work aims to leverage these opportunities to enhance robots’ learning and cognitive capabilities, enabling seamless and natural interaction. However, current research often overlooks human–robot symbiosis and pays little attention to specialized models and practical applications. This review adheres to a human-centric vision, taking language as the pivot that connects humans with large models. To the best of our knowledge, this is the first attempt to integrate HRC, MLLMs, and embodied intelligence into a holistic view. The review first introduces representative foundation models and summarizes state-of-the-art methods in the “Perception-Cognition-Actuation” loop. It then discusses pathways and platforms for efficient spatial skill learning, followed by an analysis of four key questions from the “Why, How, What, Where” perspectives. Finally, it highlights future challenges and potential research directions. It is hoped that this work can help fill the research gap between HRC and MLLMs, offering a systematic pathway for developing human-centered collaborative systems and promoting further exploration and innovation in this field. The resources are available at: https://github.com/WuDuidi/MLLM-HRC-Survey.
Source journal
Robotics and Computer-Integrated Manufacturing (Engineering Technology; Engineering: Manufacturing)
CiteScore: 24.10
Self-citation rate: 13.50%
Articles published: 160
Review time: 50 days
Journal description: The journal, Robotics and Computer-Integrated Manufacturing, focuses on sharing research applications that contribute to the development of new or enhanced robotics, manufacturing technologies, and innovative manufacturing strategies that are relevant to industry. Papers that combine theory and experimental validation are preferred, while review papers on current robotics and manufacturing issues are also considered. However, papers on traditional machining processes, modeling and simulation, supply chain management, and resource optimization are generally not within the scope of the journal, as there are more appropriate journals for these topics. Similarly, papers that are overly theoretical or mathematical will be directed to other suitable journals. The journal welcomes original papers in areas such as industrial robotics, human-robot collaboration in manufacturing, cloud-based manufacturing, cyber-physical production systems, big data analytics in manufacturing, smart mechatronics, machine learning, adaptive and sustainable manufacturing, and other fields involving unique manufacturing technologies.