Kaiwu: A Multimodal Manipulation Dataset and Framework for Robot Learning and Human-Robot Interaction
Authors: Shuo Jiang, Haonan Li, Ruochen Ren, Yanmin Zhou, Zhipeng Wang, Bin He
IEEE Robotics and Automation Letters, vol. 10, no. 11, pp. 11482-11489, published 2025-09-12
DOI: 10.1109/LRA.2025.3609615 (https://ieeexplore.ieee.org/document/11160665/)
Citation count: 0
Abstract
Cutting-edge robot learning techniques, including foundation models and imitation learning from humans, place huge demands on large-scale, high-quality datasets, which remain one of the bottlenecks in the field of general intelligent robots. This paper presents the Kaiwu multimodal dataset to address the lack of real-world, synchronized multimodal data in sophisticated assembly scenarios, in particular data with dynamics information and fine-grained labelling. The dataset first provides an integrated human, environment, and robot data collection framework with 20 subjects and 30 interaction objects, yielding a total of 11,664 instances of integrated actions. For each demonstration, hand motions, operation pressures, sounds of the assembly process, multi-view videos, high-precision motion capture information, eye gaze with first-person videos, and electromyography signals are all recorded. Fine-grained, multi-level annotation based on absolute timestamps and semantic segmentation labelling are provided. The Kaiwu dataset aims to facilitate research on robot learning, dexterous manipulation, human intention investigation, and human-robot collaboration.
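The listing does not describe a loading API, so the following is only a minimal sketch of how a demonstration record organized around absolute timestamps might be consumed: every modality is stored as a (timestamps, samples) pair, so cross-modal alignment and multi-level label lookup reduce to nearest-timestamp search. All names here (Demonstration, AnnotatedSegment, the stream keys) are illustrative assumptions, not the dataset's actual schema.

```python
from bisect import bisect_left
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

@dataclass
class AnnotatedSegment:
    """One fine-grained annotation: an action label over an absolute-time interval."""
    start_s: float   # absolute timestamp, seconds
    end_s: float
    level: int       # annotation level (hypothetical: e.g. task / action / primitive)
    label: str

@dataclass
class Demonstration:
    """Hypothetical per-instance record bundling the synchronized modalities.

    Each stream holds (absolute timestamps, samples), so aligning modalities
    is a nearest-timestamp lookup rather than per-sensor clock arithmetic.
    """
    subject_id: int
    object_id: int
    streams: Dict[str, Tuple[List[float], List]] = field(default_factory=dict)
    annotations: List[AnnotatedSegment] = field(default_factory=list)

    def sample_at(self, modality: str, t: float):
        """Return the sample of `modality` closest to absolute time `t`."""
        ts, samples = self.streams[modality]
        i = bisect_left(ts, t)
        if i == 0:
            return samples[0]
        if i == len(ts):
            return samples[-1]
        # choose the nearer of the two neighbouring timestamps
        return samples[i] if ts[i] - t < t - ts[i - 1] else samples[i - 1]

    def labels_at(self, t: float) -> List[str]:
        """All annotation labels (across levels) active at absolute time `t`."""
        return [a.label for a in self.annotations if a.start_s <= t <= a.end_s]


# Usage sketch: align an EMG reading with the glove stream and look up the
# action label active at that moment (all values are made up for illustration).
if __name__ == "__main__":
    demo = Demonstration(
        subject_id=3,
        object_id=12,
        streams={
            "hand_motion": ([0.00, 0.01, 0.02], [[0.1] * 22, [0.2] * 22, [0.3] * 22]),
            "emg":         ([0.000, 0.005, 0.010, 0.015], [10, 12, 15, 14]),
        },
        annotations=[AnnotatedSegment(0.0, 0.02, level=2, label="grasp_bolt")],
    )
    t = 0.011
    print(demo.sample_at("emg", t), demo.labels_at(t))  # -> 15 ['grasp_bolt']
```

Keying every stream and annotation to a shared absolute clock, as the abstract's "multi-level annotation based on absolute timestamps" suggests, is what makes this kind of cross-modal query straightforward; any real loader for the released data would follow whatever file layout the authors publish.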
Journal description:
The scope of this journal is to publish peer-reviewed articles that provide a timely and concise account of innovative research ideas and application results, reporting significant theoretical findings and application case studies in areas of robotics and automation.