Kaiwu: A Multimodal Manipulation Dataset and Framework for Robot Learning and Human-Robot Interaction

Impact Factor 5.3 | CAS Tier 2 (Computer Science) | JCR Q2 (Robotics)
Shuo Jiang;Haonan Li;Ruochen Ren;Yanmin Zhou;Zhipeng Wang;Bin He
{"title":"Kaiwu: A Multimodal Manipulation Dataset and Framework for Robot Learning and Human-Robot Interaction","authors":"Shuo Jiang;Haonan Li;Ruochen Ren;Yanmin Zhou;Zhipeng Wang;Bin He","doi":"10.1109/LRA.2025.3609615","DOIUrl":null,"url":null,"abstract":"Cutting-edge robot learning techniques including foundation models and imitation learning from humans all pose huge demands on large-scale and high-quality datasets which constitute one of the bottleneck in the general intelligent robot fields. This paper presents the Kaiwu multimodal dataset to address the missing real-world synchronized multimodal data problems in the sophisticated assembling scenario, especially with dynamics information and its fine-grained labelling. The dataset first provides an integration of human, environment and robot data collection framework with 20 subjects and 30 interaction objects resulting in totally 11,664 instances of integrated actions. For each of the demonstration, hand motions, operation pressures, sounds of the assembling process, multi-view videos, high-precision motion capture information, eye gaze with first-person videos, electromyography signals are all recorded. Fine-grained multi-level annotation based on absolute timestamp, and semantic segmentation labelling are performed. Kaiwu dataset aims to facilitate robot learning, dexterous manipulation, human intention investigation and human-robot collaboration research.","PeriodicalId":13241,"journal":{"name":"IEEE Robotics and Automation Letters","volume":"10 11","pages":"11482-11489"},"PeriodicalIF":5.3000,"publicationDate":"2025-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Robotics and Automation Letters","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/11160665/","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"ROBOTICS","Score":null,"Total":0}
Citations: 0

Abstract

Cutting-edge robot learning techniques, including foundation models and imitation learning from humans, place huge demands on large-scale, high-quality datasets, which constitute one of the bottlenecks in the field of general intelligent robotics. This paper presents the Kaiwu multimodal dataset to address the lack of real-world synchronized multimodal data in sophisticated assembly scenarios, especially data with dynamics information and fine-grained labelling. The dataset provides an integrated human-environment-robot data collection framework covering 20 subjects and 30 interaction objects, resulting in a total of 11,664 instances of integrated actions. For each demonstration, hand motions, operation pressures, sounds of the assembly process, multi-view videos, high-precision motion capture information, eye gaze with first-person videos, and electromyography signals are all recorded. Fine-grained multi-level annotation based on absolute timestamps and semantic segmentation labelling are performed. The Kaiwu dataset aims to facilitate research in robot learning, dexterous manipulation, human intention investigation, and human-robot collaboration.
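The abstract's combination of timestamp-aligned streams and multi-level labels lends itself to a simple per-demonstration layout. The sketch below is a minimal illustration only, assuming a hypothetical schema and helper (Demonstration, sample_at) that do not come from the Kaiwu release; it merely shows how streams recorded against a shared absolute clock could be synchronized by nearest-timestamp lookup.

```python
# Illustrative sketch only: hypothetical record layout and timestamp-based
# alignment for one multimodal demonstration. Names and values are invented.
from dataclasses import dataclass, field
from bisect import bisect_left
from typing import Any, List, Optional, Tuple


@dataclass
class Demonstration:
    """One integrated-action instance (hypothetical schema, not the released format)."""
    subject_id: int                                   # one of the 20 participants
    object_id: int                                    # one of the 30 interaction objects
    # Each stream is a list of (absolute_timestamp_in_seconds, sample) pairs,
    # sorted by timestamp; all streams share the same absolute clock.
    hand_motion: List[Tuple[float, Any]] = field(default_factory=list)
    pressure: List[Tuple[float, Any]] = field(default_factory=list)
    audio: List[Tuple[float, Any]] = field(default_factory=list)
    emg: List[Tuple[float, Any]] = field(default_factory=list)
    gaze: List[Tuple[float, Any]] = field(default_factory=list)
    # Multi-level annotations reference the same clock: (start, end, action_label).
    labels: List[Tuple[float, float, str]] = field(default_factory=list)


def sample_at(stream: List[Tuple[float, Any]], t: float) -> Optional[Any]:
    """Return the sample whose timestamp is nearest to t (simple cross-modal sync)."""
    if not stream:
        return None
    times = [ts for ts, _ in stream]
    i = bisect_left(times, t)
    if i == 0:
        return stream[0][1]
    if i == len(times):
        return stream[-1][1]
    before_t, before_s = stream[i - 1]
    after_t, after_s = stream[i]
    return after_s if (after_t - t) < (t - before_t) else before_s


# Example: pull synchronized pressure and EMG samples at the start of a label.
demo = Demonstration(subject_id=1, object_id=7)
demo.pressure = [(0.00, 0.1), (0.05, 0.4), (0.10, 0.9)]
demo.emg = [(0.00, [0.2] * 8), (0.04, [0.3] * 8), (0.09, [0.5] * 8)]
demo.labels = [(0.04, 0.10, "grasp part")]
start, _, action = demo.labels[0]
print(action, sample_at(demo.pressure, start), sample_at(demo.emg, start))
```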
Source Journal
IEEE Robotics and Automation Letters (Computer Science: Computer Science Applications)
CiteScore: 9.60
Self-citation rate: 15.40%
Annual publications: 1428
Journal scope: The scope of this journal is to publish peer-reviewed articles that provide a timely and concise account of innovative research ideas and application results, reporting significant theoretical findings and application case studies in areas of robotics and automation.