{"title":"SUMediPose: A 2D-3D pose estimation dataset","authors":"Chris-Mari Schreuder , Oloff Bergh , Lizé Steyn , Rensu P. Theart","doi":"10.1016/j.dib.2025.111579","DOIUrl":null,"url":null,"abstract":"<div><div>Biomechanical movement analysis is crucial in medical and sports contexts, yet the technology remains expensive and inaccessible to many. Recent advancements in machine learning and computer vision, particularly in Pose Estimation (PE), offer promising alternatives. PE models detect key points on the human body to estimate its pose in either 2D or 3D space, enabling markerless motion capture. This approach facilitates more natural and flexible movement tracking without the need for physical markers. However, markerless systems generally lack the accuracy of marker-based methods and require extensive annotated data for training, which is often anatomically inaccurate. Additionally, current 3D pose estimation techniques face practical challenges, including complex hardware setups, intricate camera calibrations, and a shortage of reliable ground truth 2D-3D datasets.</div><div>To address these challenges, we introduce a multimodal dataset comprising 3,444 recordings, 2,896,943 image frames, and 3,804,413 corresponding 3D and 2D marker-based motion capture keypoint coordinates. The dataset includes 28 participants performing eight strength and conditioning actions at three different speeds, with full image and keypoint data available for 26 participants, while two participants have only keypoint data without accompanying image data. Video and image data were captured using a custom-developed multi-RGB-camera system, while the marker-based 3D data was acquired using the Vicon system and subsequently projected into each camera’s internal coordinate system, represented in both 3D space and 2D image space. 
The multi-RGB-camera system consists of six cameras arranged in a circular formation around the subject, offering a full 360° view of the scene from the same height and resulting in a diverse set of viewing angles. The recording setup was designed to allow both capture systems to record participants' movements simultaneously, synchronizing the data to provide ground truth 3D data, which was then back-projected to generate 2D-pixel keypoint data for each corresponding image frame. This design enables the dataset to support both 2D and 3D pose estimation tasks. To ensure anatomical accuracy, a professional placed an extensive array of markers on each participant, adhering to industry standards.</div><div>The dataset also includes all intrinsic and extrinsic camera parameters, as well as origin axis data, necessary for performing any 3D or 2D projections. This allows the dataset to be adjusted and tailored to meet specific research or application needs.</div></div>","PeriodicalId":10973,"journal":{"name":"Data in Brief","volume":"60 ","pages":"Article 111579"},"PeriodicalIF":1.0000,"publicationDate":"2025-04-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Data in Brief","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2352340925003117","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"MULTIDISCIPLINARY SCIENCES","Score":null,"Total":0}
Citations: 0
Abstract
Biomechanical movement analysis is crucial in medical and sports contexts, yet the technology remains expensive and inaccessible to many. Recent advancements in machine learning and computer vision, particularly in Pose Estimation (PE), offer promising alternatives. PE models detect key points on the human body to estimate its pose in either 2D or 3D space, enabling markerless motion capture. This approach facilitates more natural and flexible movement tracking without the need for physical markers. However, markerless systems generally lack the accuracy of marker-based methods and require extensive annotated training data, which is often anatomically inaccurate. Additionally, current 3D pose estimation techniques face practical challenges, including complex hardware setups, intricate camera calibrations, and a shortage of reliable ground truth 2D-3D datasets.
To address these challenges, we introduce a multimodal dataset comprising 3,444 recordings, 2,896,943 image frames, and 3,804,413 corresponding 3D and 2D marker-based motion capture keypoint coordinates. The dataset includes 28 participants performing eight strength and conditioning actions at three different speeds; full image and keypoint data are available for 26 participants, while the remaining two have only keypoint data without accompanying images. Video and image data were captured using a custom-developed multi-RGB-camera system, while the marker-based 3D data was acquired using the Vicon system and subsequently projected into each camera's internal coordinate system, represented in both 3D space and 2D image space. The multi-RGB-camera system consists of six cameras arranged in a circular formation around the subject at the same height, offering a full 360° view of the scene and a diverse set of viewing angles. The recording setup allows both capture systems to record participants' movements simultaneously, synchronizing the data to provide ground truth 3D data, which was then back-projected to generate 2D pixel keypoint data for each corresponding image frame. This design enables the dataset to support both 2D and 3D pose estimation tasks. To ensure anatomical accuracy, a professional placed an extensive array of markers on each participant, adhering to industry standards.
The dataset also includes all intrinsic and extrinsic camera parameters, as well as origin axis data, necessary for performing any 3D or 2D projections. This allows the dataset to be adjusted and tailored to meet specific research or application needs.
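Given the intrinsic and extrinsic parameters supplied with the dataset, the 3D-to-2D back-projection described above follows the standard pinhole camera model. The sketch below is a minimal, hypothetical illustration of that projection (it assumes an undistorted pinhole model and example parameter values; it is not the authors' actual pipeline or file format):

```python
import numpy as np

def project_points(X_world, K, R, t):
    """Project Nx3 world-space keypoints to Nx2 pixel coordinates.

    Pinhole model without lens distortion:
      K -- 3x3 intrinsic matrix (focal lengths and principal point)
      R -- 3x3 rotation, t -- 3-vector translation (extrinsics, world -> camera)
    """
    X_cam = X_world @ R.T + t      # transform into the camera's coordinate system
    x_hom = X_cam @ K.T            # apply intrinsics (homogeneous image coordinates)
    return x_hom[:, :2] / x_hom[:, 2:3]  # perspective divide -> pixel coordinates

# Toy example: camera at the world origin looking down +Z (illustrative values only)
K = np.array([[800.0, 0.0, 640.0],
              [0.0, 800.0, 360.0],
              [0.0,   0.0,   1.0]])
R = np.eye(3)
t = np.zeros(3)

keypoints_3d = np.array([[0.0, 0.0, 2.0],    # point on the optical axis, 2 m away
                         [0.1, -0.2, 2.0]])
uv = project_points(keypoints_3d, K, R, t)
# The on-axis point lands at the principal point (640, 360).
```

In practice, one such (K, R, t) triple per camera is enough to regenerate the 2D keypoint annotations for any of the six views from the shared 3D ground truth.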
About the journal:
Data in Brief provides a way for researchers to easily share and reuse each other's datasets by publishing data articles that:
- Thoroughly describe your data, facilitating reproducibility.
- Make your data, which is often buried in supplementary material, easier to find.
- Increase traffic towards associated research articles and data, leading to more citations.
- Open up doors for new collaborations.
Because you never know what data will be useful to someone else, Data in Brief welcomes submissions that describe data from all research areas.