The effect of depth data and upper limb impairment on lightweight monocular RGB human pose estimation models.

Impact factor: 2.9 · CAS Region 4 (Medicine) · JCR Q3 (Engineering, Biomedical)
Gloria-Edith Boudreault-Morales, Cesar Marquez-Chin, Xilin Liu, José Zariffa
DOI: 10.1186/s12938-025-01347-y
Journal: BioMedical Engineering OnLine, vol. 24, no. 1, p. 12
Published: 2025-02-07
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11804014/pdf/
Citations: 0

Abstract

Background and objectives: Markerless vision-based human pose estimation (HPE) is a promising avenue towards scalable data collection in rehabilitation. Deploying this technology will require self-contained systems able to process data efficiently and accurately. The aims of this work are to (1) determine how depth data affect the performance (accuracy and speed) of lightweight monocular red-green-blue (RGB) HPE models, to inform sensor selection, and (2) validate HPE models using data from individuals with physical impairments.

Methods: Two HPE models were investigated: Dite-HRNet and MobileHumanPose (capable of 2D and 3D HPE, respectively). The models were modified to accept depth data as an additional input using three fusion techniques: an early fusion method, a simple intermediate fusion method (using concatenation), and a complex intermediate fusion method (using dedicated fusion blocks, additional convolutional layers, and concatenation). All fusion techniques used RGB-D data, in contrast to the original models, which used RGB data only. The models were trained, validated, and tested on the CMU Panoptic and Human3.6M data sets as well as a custom data set. The custom data set comprises RGB-D and optical motion capture data from 15 uninjured and 12 post-stroke individuals performing movements involving their upper limbs. HPE model performance was assessed in terms of accuracy and computational efficiency. Evaluation metrics include Mean Per Joint Position Error (MPJPE), floating point operations (FLOPs), and frame rate (frames per second).
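For readers unfamiliar with the fusion terminology, early fusion can be sketched as stacking the depth map as a fourth input channel alongside RGB, so that only the network's first convolution needs to change (3 to 4 input channels). This is a minimal illustrative sketch with assumed names and array shapes, not the paper's actual implementation:

```python
import numpy as np

def early_fusion(rgb: np.ndarray, depth: np.ndarray) -> np.ndarray:
    """Concatenate an HxWx3 RGB image with an HxW depth map into an
    HxWx4 tensor, the input form an early-fusion network would consume."""
    assert rgb.shape[:2] == depth.shape[:2], "RGB and depth must be pixel-aligned"
    return np.concatenate([rgb, depth[..., None]], axis=-1)

# Illustrative crop size; the actual input resolution depends on the model.
rgb = np.zeros((256, 192, 3), dtype=np.float32)
depth = np.ones((256, 192), dtype=np.float32)
fused = early_fusion(rgb, depth)
print(fused.shape)  # (256, 192, 4)
```

Intermediate fusion, by contrast, runs separate RGB and depth branches and merges their feature maps deeper in the network, which adds parameters and computation relative to this single-branch approach.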

Results: The early fusion architecture consistently delivered the lowest MPJPE in both the 2D and 3D HPE cases while achieving FLOPs and frame rates similar to those of its RGB counterpart. These results were consistent regardless of the data used to train and test the HPE models. Comparisons between the uninjured and stroke groups revealed no significant effect (all p values > 0.36) of motor impairment on the accuracy of any model.
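MPJPE, the accuracy metric compared above, is simply the mean Euclidean distance between predicted and ground-truth joint positions. A minimal sketch (assumed array shapes, not the authors' evaluation code):

```python
import numpy as np

def mpjpe(pred: np.ndarray, gt: np.ndarray) -> float:
    """Mean Per Joint Position Error: average Euclidean distance between
    predicted and ground-truth joints, each array shaped (num_joints, dims)."""
    return float(np.linalg.norm(pred - gt, axis=-1).mean())

# Two toy 3D joints: per-joint errors are 3 and 4, so MPJPE = 3.5.
gt = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
pred = np.array([[0.0, 0.0, 3.0], [1.0, 4.0, 0.0]])
print(mpjpe(pred, gt))  # 3.5
```

In practice the metric is reported in pixels for 2D HPE and in millimeters for 3D HPE, and is averaged over all frames and subjects.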

Conclusions: Including depth data through an early fusion architecture improves the accuracy-efficiency trade-off of the HPE model, and HPE accuracy is not affected by the presence of physical impairments. These results suggest that combining depth and RGB data is beneficial for HPE, and that models trained on data collected from uninjured individuals can generalize to persons with physical impairments.

Source journal
BioMedical Engineering OnLine (Engineering, Biomedical)
CiteScore: 6.70
Self-citation rate: 2.60%
Articles published per year: 79
Review time: 1 month
Journal introduction: BioMedical Engineering OnLine is an open access, peer-reviewed journal that is dedicated to publishing research in all areas of biomedical engineering. BioMedical Engineering OnLine is aimed at readers and authors throughout the world, with an interest in using tools of the physical and data sciences and techniques in engineering to understand and solve problems in the biological and medical sciences. Topical areas include, but are not limited to: Bioinformatics, Bioinstrumentation, Biomechanics, Biomedical Devices & Instrumentation, Biomedical Signal Processing, Healthcare Information Systems, Human Dynamics, Neural Engineering, Rehabilitation Engineering, Biomaterials, Biomedical Imaging & Image Processing, BioMEMS and On-Chip Devices, Bio-Micro/Nano Technologies, Biomolecular Engineering, Biosensors, Cardiovascular Systems Engineering, Cellular Engineering, Clinical Engineering, Computational Biology, Drug Delivery Technologies, Modeling Methodologies, Nanomaterials and Nanotechnology in Biomedicine, Respiratory Systems Engineering, Robotics in Medicine, Systems and Synthetic Biology, Systems Biology, Telemedicine/Smartphone Applications in Medicine, Therapeutic Systems, Devices and Technologies, and Tissue Engineering.