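Fusing CNNs and attention-mechanisms to improve real-time indoor Human Activity Recognition for classifying home-based physical rehabilitation exercises.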

IF 7.0 · Zone 2 (Medicine) · JCR Q1 (Biology)
Moamen Zaher, Amr S Ghoneim, Laila Abdelhamid, Ayman Atia
Computers in Biology and Medicine, vol. 184, article 109399. Journal article, published 2024-11-25.
DOI: 10.1016/j.compbiomed.2024.109399 (https://doi.org/10.1016/j.compbiomed.2024.109399)
Citations: 0

Abstract

Fusing CNNs and attention-mechanisms to improve real-time indoor Human Activity Recognition for classifying home-based physical rehabilitation exercises.

Physical rehabilitation plays a critical role in enhancing health outcomes globally. However, the shortage of physiotherapists, particularly in developing countries where the ratio is approximately ten physiotherapists per million people, poses a significant challenge to effective rehabilitation services. The existing literature on rehabilitation often falls short in data representation and in its use of diverse modalities, limiting the potential for advanced therapeutic interventions. To address this gap, this study integrates Computer Vision and Human Activity Recognition (HAR) technologies to support home-based rehabilitation, exploring various modalities and proposing a framework for data representation. We introduce a novel framework that leverages both Continuous Wavelet Transform (CWT) and Mel-Frequency Cepstral Coefficients (MFCC) for skeletal data representation. CWT is particularly valuable for capturing the time-frequency characteristics of the dynamic movements involved in rehabilitation exercises, enabling a comprehensive depiction of both temporal and spectral features. This dual capability is crucial for accurately modelling the complex and variable nature of rehabilitation exercises. In our analysis, we evaluate 20 CNN-based models and one Vision Transformer (ViT) model. Additionally, we propose 12 hybrid architectures that combine CNN-based models with ViT in bi-model and tri-model configurations. These models are tested on the UI-PRMD and KIMORE benchmark datasets using accuracy, precision, recall, and F1-score as evaluation metrics, with 5-fold cross-validation. Our evaluation also considers real-time performance, model size, and efficiency on low-power devices, emphasising practical applicability. The proposed fused tri-model architectures outperform both single-architecture and bi-model configurations, demonstrating robust performance across both datasets and making the fused models the preferred choice for rehabilitation tasks. Our proposed hybrid model, DenMobVit, consistently surpasses state-of-the-art methods, achieving accuracy improvements of 2.9% and 1.97% on the UI-PRMD and KIMORE datasets, respectively. These findings highlight the effectiveness of our approach in advancing rehabilitation technologies and bridging the gap in physiotherapy services.
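
To make the CWT-based skeletal representation mentioned in the abstract more concrete, the following is a minimal sketch of how a single joint coordinate could be turned into a time-frequency map (a scalogram). It is not the authors' implementation: the abstract does not state the wavelet, the scales, or the frame rate, so the complex Morlet wavelet, the 30 frames/s rate, the candidate frequencies, and the names `morlet`, `cwt_scalogram`, and `joint_y` are all assumptions chosen for illustration. In practice a library routine such as PyWavelets' `pywt.cwt` would probably be used instead of this pure-Python loop.

```python
import cmath
import math


def morlet(t, w0=6.0):
    """Complex Morlet mother wavelet; w0 is its centre angular frequency."""
    return math.pi ** -0.25 * cmath.exp(1j * w0 * t) * math.exp(-0.5 * t * t)


def cwt_scalogram(signal, scales, dt, w0=6.0):
    """Return |W(s, b)| as one row per scale, each row len(signal) long."""
    n = len(signal)
    rows = []
    for s in scales:
        # Truncate the wavelet where its Gaussian envelope is negligible.
        half = math.ceil(4.0 * s / dt)
        kernel = [
            morlet(k * dt / s, w0).conjugate() * dt / math.sqrt(s)
            for k in range(-half, half + 1)
        ]
        row = []
        for b in range(n):
            acc = 0j
            # Visit only the kernel taps that overlap the signal (zero padding).
            lo = max(0, b - half)
            hi = min(n, b + half + 1)
            for i in range(lo, hi):
                acc += signal[i] * kernel[i - b + half]
            row.append(abs(acc))
        rows.append(row)
    return rows


# Hypothetical input: one coordinate of one joint, 6 s at an assumed 30 frames/s,
# oscillating at 1.5 Hz to mimic a repetitive exercise.
FPS = 30.0
joint_y = [math.sin(2.0 * math.pi * 1.5 * i / FPS) for i in range(180)]

# Candidate frequencies (assumed) mapped to Morlet scales via s ~ w0 / (2*pi*f).
freqs_hz = [0.5, 1.0, 1.5, 2.0, 3.0]
scales = [6.0 / (2.0 * math.pi * f) for f in freqs_hz]

scalogram = cwt_scalogram(joint_y, scales, dt=1.0 / FPS)

# Time-averaged magnitude per scale; the largest one marks the dominant rhythm.
mean_magnitude = [sum(row) / len(row) for row in scalogram]
dominant_hz = freqs_hz[mean_magnitude.index(max(mean_magnitude))]
print("dominant frequency (Hz):", dominant_hz)
```

Each row of `scalogram` tracks how strongly one frequency band is present at each frame, which is the combined temporal and spectral view the abstract attributes to CWT. Stacking such maps across joints and coordinates would give an image-like input that CNN or ViT backbones can consume, although how the paper actually arranges them is not stated in the abstract.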

Source journal
Computers in Biology and Medicine (Engineering and Technology: Biomedical Engineering)
CiteScore: 11.70
Self-citation rate: 10.40%
Articles published: 1086
Review time: 74 days
About the journal: Computers in Biology and Medicine is an international forum for sharing groundbreaking advancements in the use of computers in bioscience and medicine. This journal serves as a medium for communicating essential research, instruction, ideas, and information regarding the rapidly evolving field of computer applications in these domains. By encouraging the exchange of knowledge, we aim to facilitate progress and innovation in the utilization of computers in biology and medicine.