Weiping Liu, Xiaozhen Lin, Xinghong Chen, Yifang Liu, Zengxin Zhong, Rong Chen, Guannan Chen, Yu Lin
{"title":"帕金森病步态的视觉-骨骼双模态评估框架","authors":"Weiping Liu , Xiaozhen Lin , Xinghong Chen , Yifang Liu , Zengxin Zhong , Rong Chen , Guannan Chen , Yu Lin","doi":"10.1016/j.media.2025.103727","DOIUrl":null,"url":null,"abstract":"<div><div>Gait abnormalities in Parkinson’s disease (PD) can reflect the extent of dysfunction, and making their assessment crucial for the diagnosis and treatment of PD. Current video-based methods of PD gait assessment are limited to only focusing on skeleton motion information and are confined to evaluations from a single perspective. To overcome these limitations, we propose a novel vision-skeleton dual-modality framework, which integrates keypoints vision features with skeleton motion information to enable a more accurate and comprehensive assessment of PD gait. We firstly introduce the Keypoints Vision Transformer, a novel architecture designed to extract vision features of human keypoints. This model encompasses both the spatial locations and connectivity relationships of human keypoints. Subsequently, through the proposed temporal fusion encoder, we integrate the extracted skeleton motion with keypoints vision features to enhance the extraction of temporal motion features. In a video dataset of 241 PD participants recorded from the front, our proposed framework achieves an assessment accuracy of 78.05%, which demonstrates superior performance compared to other methods. To enhance the interpretability of our method, we also conduct a feature visualization analysis of the proposed dual-modality framework, which reveal the mechanisms of different body parts and dual-modality branch in PD gait assessment. Additionally, when applied to another video dataset recorded from a more general perspective, our method still achieves a commendable accuracy of 73.07%. This achievement demonstrates the robust generalization capability of the proposed model in PD gait assessment from cross-view, which offers a novel approach for realizing unrestricted PD gait assessment in home monitoring. The latest version of the code is available at <span><span>https://github.com/FJNU-LWP/PD-gait-VSDF</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"105 ","pages":"Article 103727"},"PeriodicalIF":11.8000,"publicationDate":"2025-07-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Vision-skeleton dual-modality framework for generalizable assessment of Parkinson’s disease gait\",\"authors\":\"Weiping Liu , Xiaozhen Lin , Xinghong Chen , Yifang Liu , Zengxin Zhong , Rong Chen , Guannan Chen , Yu Lin\",\"doi\":\"10.1016/j.media.2025.103727\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Gait abnormalities in Parkinson’s disease (PD) can reflect the extent of dysfunction, and making their assessment crucial for the diagnosis and treatment of PD. Current video-based methods of PD gait assessment are limited to only focusing on skeleton motion information and are confined to evaluations from a single perspective. To overcome these limitations, we propose a novel vision-skeleton dual-modality framework, which integrates keypoints vision features with skeleton motion information to enable a more accurate and comprehensive assessment of PD gait. We firstly introduce the Keypoints Vision Transformer, a novel architecture designed to extract vision features of human keypoints. 
This model encompasses both the spatial locations and connectivity relationships of human keypoints. Subsequently, through the proposed temporal fusion encoder, we integrate the extracted skeleton motion with keypoints vision features to enhance the extraction of temporal motion features. In a video dataset of 241 PD participants recorded from the front, our proposed framework achieves an assessment accuracy of 78.05%, which demonstrates superior performance compared to other methods. To enhance the interpretability of our method, we also conduct a feature visualization analysis of the proposed dual-modality framework, which reveal the mechanisms of different body parts and dual-modality branch in PD gait assessment. Additionally, when applied to another video dataset recorded from a more general perspective, our method still achieves a commendable accuracy of 73.07%. This achievement demonstrates the robust generalization capability of the proposed model in PD gait assessment from cross-view, which offers a novel approach for realizing unrestricted PD gait assessment in home monitoring. The latest version of the code is available at <span><span>https://github.com/FJNU-LWP/PD-gait-VSDF</span><svg><path></path></svg></span>.</div></div>\",\"PeriodicalId\":18328,\"journal\":{\"name\":\"Medical image analysis\",\"volume\":\"105 \",\"pages\":\"Article 103727\"},\"PeriodicalIF\":11.8000,\"publicationDate\":\"2025-07-07\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Medical image analysis\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S1361841525002749\",\"RegionNum\":1,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Medical image analysis","FirstCategoryId":"5","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1361841525002749","RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Vision-skeleton dual-modality framework for generalizable assessment of Parkinson’s disease gait
Gait abnormalities in Parkinson’s disease (PD) can reflect the extent of dysfunction, making their assessment crucial for the diagnosis and treatment of PD. Current video-based methods of PD gait assessment focus only on skeleton motion information and are confined to evaluation from a single viewing perspective. To overcome these limitations, we propose a novel vision-skeleton dual-modality framework, which integrates keypoint vision features with skeleton motion information to enable a more accurate and comprehensive assessment of PD gait. We first introduce the Keypoints Vision Transformer, a novel architecture designed to extract vision features of human keypoints that encodes both their spatial locations and their connectivity relationships. Subsequently, the proposed temporal fusion encoder integrates the extracted skeleton motion with the keypoint vision features to enhance the extraction of temporal motion features. On a video dataset of 241 PD participants recorded from the front, the proposed framework achieves an assessment accuracy of 78.05%, outperforming other methods. To enhance the interpretability of our method, we also conduct a feature visualization analysis of the proposed dual-modality framework, which reveals the contributions of different body parts and of the two modality branches to PD gait assessment. Additionally, when applied to another video dataset recorded from a more general perspective, our method still achieves a commendable accuracy of 73.07%. This result demonstrates the robust cross-view generalization capability of the proposed model in PD gait assessment and offers a novel approach toward unrestricted PD gait assessment in home monitoring. The latest version of the code is available at https://github.com/FJNU-LWP/PD-gait-VSDF.
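To make the dual-modality idea in the abstract more concrete, the sketch below shows one possible way to combine per-keypoint vision tokens (appearance plus spatial location, mixed by a transformer that can model connectivity among keypoints) with skeleton-motion features in a temporal encoder. This is a minimal illustrative sketch in PyTorch, not the authors' implementation (see the linked repository for that); all module names, tensor shapes, embedding dimensions, and the number of severity classes are assumptions.

```python
# Minimal sketch of a keypoint-vision + skeleton-motion dual-modality model.
# NOT the authors' implementation; shapes, dimensions, and class count are illustrative.
import torch
import torch.nn as nn


class KeypointVisionEncoder(nn.Module):
    """Embed per-keypoint image patches plus their spatial locations, then let a
    transformer encoder model relationships (connectivity) among keypoints."""
    def __init__(self, patch_dim, num_keypoints, d_model=128, nhead=4, nlayers=2):
        super().__init__()
        self.patch_proj = nn.Linear(patch_dim, d_model)   # appearance of each keypoint patch
        self.coord_proj = nn.Linear(2, d_model)           # spatial location of each keypoint
        self.keypoint_embed = nn.Parameter(torch.zeros(num_keypoints, d_model))
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, nlayers)

    def forward(self, patches, coords):
        # patches: (B*T, K, patch_dim), coords: (B*T, K, 2)
        tokens = self.patch_proj(patches) + self.coord_proj(coords) + self.keypoint_embed
        return self.encoder(tokens).mean(dim=1)           # (B*T, d_model) per-frame feature


class TemporalFusion(nn.Module):
    """Fuse per-frame keypoint vision features with skeleton-motion features over time."""
    def __init__(self, d_model=128, nhead=4, nlayers=2, num_classes=3):
        super().__init__()
        self.skeleton_proj = nn.Linear(2, d_model)        # raw (x, y) coordinates as motion cue
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.temporal = nn.TransformerEncoder(layer, nlayers)
        self.head = nn.Linear(d_model, num_classes)       # num_classes is an assumption

    def forward(self, vision_seq, coords):
        # vision_seq: (B, T, d_model); coords: (B, T, K, 2)
        motion_seq = self.skeleton_proj(coords).mean(dim=2)   # (B, T, d_model)
        fused = self.temporal(vision_seq + motion_seq)        # simple additive fusion
        return self.head(fused.mean(dim=1))                   # gait-severity logits


if __name__ == "__main__":
    B, T, K, P = 2, 16, 17, 8 * 8 * 3        # batch, frames, keypoints, flattened patch size
    patches = torch.randn(B * T, K, P)
    coords = torch.rand(B, T, K, 2)
    vision = KeypointVisionEncoder(P, K)(patches, coords.view(B * T, K, 2)).view(B, T, -1)
    logits = TemporalFusion()(vision, coords)
    print(logits.shape)                       # torch.Size([2, 3])
```

The additive fusion and the mean pooling over keypoints and frames are deliberate simplifications; the actual framework described in the abstract uses its own fusion design within the temporal fusion encoder.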
Journal Introduction:
Medical Image Analysis serves as a platform for sharing new research findings in the realm of medical and biological image analysis, with a focus on applications of computer vision, virtual reality, and robotics to biomedical imaging challenges. The journal prioritizes the publication of high-quality, original papers contributing to the fundamental science of processing, analyzing, and utilizing medical and biological images. It welcomes approaches utilizing biomedical image datasets across all spatial scales, from molecular/cellular imaging to tissue/organ imaging.