The effect of depth data and upper limb impairment on lightweight monocular RGB human pose estimation models.

Impact factor: 2.9; Zone 4 (Medicine); JCR Q3 (Engineering, Biomedical)

BioMedical Engineering OnLine Pub Date : 2025-02-07 DOI:10.1186/s12938-025-01347-y

Gloria-Edith Boudreault-Morales, Cesar Marquez-Chin, Xilin Liu, José Zariffa

BioMedical Engineering OnLine 2025; 24(1): 12. Journal Article. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11804014/pdf/

Citations: 0

Abstract

Background and objectives: Markerless vision-based human pose estimation (HPE) is a promising avenue towards scalable data collection in rehabilitation. Deploying this technology will require self-contained systems able to process data efficiently and accurately. The aims of this work are to (1) determine how depth data affects the performance (accuracy and speed) of lightweight monocular red-green-blue (RGB) HPE models, to inform sensor selection, and (2) validate HPE models using data from individuals with physical impairments.

Methods: Two HPE models were investigated: Dite-HRNet and MobileHumanPose (capable of 2D and 3D HPE, respectively). The models were modified to accept depth data as an input using three different fusion techniques: an early fusion method, a simple intermediate fusion method (using concatenation), and a complex intermediate fusion method (using specific fusion blocks, additional convolutional layers, and concatenation). All fusion techniques used RGB-D data, in contrast to the original models, which used only RGB data. The models were trained, validated and tested using the CMU Panoptic and Human3.6M data sets as well as a custom data set. The custom data set includes RGB-D and optical motion capture data from 15 uninjured and 12 post-stroke individuals, recorded while they performed movements involving their upper limbs. HPE model performance was assessed in terms of accuracy and computational efficiency. Evaluation metrics include mean per joint position error (MPJPE), floating point operations (FLOPs) and frame rate (frames per second).
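As an illustration of the early fusion idea described above, the following is a minimal sketch of stacking an aligned depth map onto an RGB frame as a fourth input channel. It is not the authors' implementation: the function name `early_fuse`, the `max_depth_mm` scaling constant, the 256x192 frame size and the normalization scheme are all assumptions made for the example, since the abstract does not specify how depth was preprocessed.

```python
import numpy as np


def early_fuse(rgb, depth, max_depth_mm=5000.0):
    """Stack an RGB frame and an aligned depth map into one 4-channel array.

    rgb:   (H, W, 3) uint8 image.
    depth: (H, W) depth map in millimetres, pixel-aligned with rgb.
    max_depth_mm is a hypothetical clipping range, not a value from the paper.
    """
    if rgb.shape[:2] != depth.shape:
        raise ValueError("RGB and depth frames must share height and width")
    # Scale both modalities to [0, 1] so they have comparable ranges.
    rgb_scaled = rgb.astype(np.float32) / 255.0
    depth_scaled = np.clip(depth.astype(np.float32) / max_depth_mm, 0.0, 1.0)
    # Early fusion: concatenate along the channel axis before any network layer.
    return np.concatenate([rgb_scaled, depth_scaled[..., None]], axis=-1)


# Synthetic frame pair standing in for one RGB-D sample.
rgb = np.zeros((256, 192, 3), dtype=np.uint8)
depth = np.full((256, 192), 2500, dtype=np.uint16)
fused = early_fuse(rgb, depth)
print(fused.shape)
```

A network consuming this input would need its first convolutional layer to accept four channels instead of three; the rest of the architecture can stay unchanged, which is consistent with early fusion adding little computational cost.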

Results: The early fusion architecture consistently delivered the lowest MPJPE in both the 2D and 3D HPE cases while achieving FLOPs and frame rates similar to those of its RGB counterpart. These results were consistent regardless of the data used for training and testing the HPE models. Comparisons between the uninjured and stroke groups did not reveal a significant effect (all p values > 0.36) of motor impairment on the accuracy of any model.
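For readers unfamiliar with the accuracy metric, MPJPE is commonly defined as the Euclidean distance between predicted and ground-truth joint positions, averaged over all joints and frames. The sketch below follows that common definition; the function name `mpjpe` and the array layout are assumptions, and any alignment step (such as root-joint centering) that a specific evaluation protocol might apply is omitted.

```python
import numpy as np


def mpjpe(pred, target):
    """Mean per joint position error.

    pred, target: arrays of shape (frames, joints, dims), with dims = 2 or 3,
    expressed in the same units (for example pixels or millimetres).
    """
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    if pred.shape != target.shape:
        raise ValueError("pred and target must have the same shape")
    # Euclidean distance per joint per frame, then the mean over everything.
    per_joint_error = np.linalg.norm(pred - target, axis=-1)
    return float(per_joint_error.mean())


# Toy example: every predicted joint is offset by (3, 4, 0) from the truth,
# so each joint error is 5 and the mean is also 5.
target = np.zeros((2, 3, 3))
pred = target + np.array([3.0, 4.0, 0.0])
error = mpjpe(pred, target)
print(error)
```

The same function applies to the 2D and 3D cases, since the norm is taken over the last axis whatever its size.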

Conclusions: Including depth data using an early fusion architecture improves the accuracy-efficiency trade-off of the HPE model. HPE accuracy is not affected by the presence of physical impairments. These results suggest that using depth data with RGB data is beneficial to HPE, and that models trained with data collected from uninjured individuals can generalize to persons with physical impairments.


Source journal

BioMedical Engineering OnLine (Engineering & Technology: Engineering, Biomedical)

CiteScore: 6.70

Self-citation rate: 2.60%

Articles published:

Review time: 1 month

Journal description: BioMedical Engineering OnLine is an open access, peer-reviewed journal dedicated to publishing research in all areas of biomedical engineering. It is aimed at readers and authors throughout the world with an interest in using tools of the physical and data sciences and techniques in engineering to understand and solve problems in the biological and medical sciences. Topical areas include, but are not limited to: Bioinformatics, Bioinstrumentation, Biomechanics, Biomedical Devices & Instrumentation, Biomedical Signal Processing, Healthcare Information Systems, Human Dynamics, Neural Engineering, Rehabilitation Engineering, Biomaterials, Biomedical Imaging & Image Processing, BioMEMS and On-Chip Devices, Bio-Micro/Nano Technologies, Biomolecular Engineering, Biosensors, Cardiovascular Systems Engineering, Cellular Engineering, Clinical Engineering, Computational Biology, Drug Delivery Technologies, Modeling Methodologies, Nanomaterials and Nanotechnology in Biomedicine, Respiratory Systems Engineering, Robotics in Medicine, Systems and Synthetic Biology, Systems Biology, Telemedicine/Smartphone Applications in Medicine, Therapeutic Systems, Devices and Technologies, and Tissue Engineering.