Weiping Liu, Xiaozhen Lin, Xinghong Chen, Yifang Liu, Zengxin Zhong, Rong Chen, Guannan Chen, Yu Lin
{"title":"帕金森病步态的视觉-骨骼双模态评估框架","authors":"Weiping Liu , Xiaozhen Lin , Xinghong Chen , Yifang Liu , Zengxin Zhong , Rong Chen , Guannan Chen , Yu Lin","doi":"10.1016/j.media.2025.103727","DOIUrl":null,"url":null,"abstract":"<div><div>Gait abnormalities in Parkinson’s disease (PD) can reflect the extent of dysfunction, and making their assessment crucial for the diagnosis and treatment of PD. Current video-based methods of PD gait assessment are limited to only focusing on skeleton motion information and are confined to evaluations from a single perspective. To overcome these limitations, we propose a novel vision-skeleton dual-modality framework, which integrates keypoints vision features with skeleton motion information to enable a more accurate and comprehensive assessment of PD gait. We firstly introduce the Keypoints Vision Transformer, a novel architecture designed to extract vision features of human keypoints. This model encompasses both the spatial locations and connectivity relationships of human keypoints. Subsequently, through the proposed temporal fusion encoder, we integrate the extracted skeleton motion with keypoints vision features to enhance the extraction of temporal motion features. In a video dataset of 241 PD participants recorded from the front, our proposed framework achieves an assessment accuracy of 78.05%, which demonstrates superior performance compared to other methods. To enhance the interpretability of our method, we also conduct a feature visualization analysis of the proposed dual-modality framework, which reveal the mechanisms of different body parts and dual-modality branch in PD gait assessment. Additionally, when applied to another video dataset recorded from a more general perspective, our method still achieves a commendable accuracy of 73.07%. This achievement demonstrates the robust generalization capability of the proposed model in PD gait assessment from cross-view, which offers a novel approach for realizing unrestricted PD gait assessment in home monitoring. The latest version of the code is available at <span><span>https://github.com/FJNU-LWP/PD-gait-VSDF</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"105 ","pages":"Article 103727"},"PeriodicalIF":11.8000,"publicationDate":"2025-07-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Vision-skeleton dual-modality framework for generalizable assessment of Parkinson’s disease gait\",\"authors\":\"Weiping Liu , Xiaozhen Lin , Xinghong Chen , Yifang Liu , Zengxin Zhong , Rong Chen , Guannan Chen , Yu Lin\",\"doi\":\"10.1016/j.media.2025.103727\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Gait abnormalities in Parkinson’s disease (PD) can reflect the extent of dysfunction, and making their assessment crucial for the diagnosis and treatment of PD. Current video-based methods of PD gait assessment are limited to only focusing on skeleton motion information and are confined to evaluations from a single perspective. To overcome these limitations, we propose a novel vision-skeleton dual-modality framework, which integrates keypoints vision features with skeleton motion information to enable a more accurate and comprehensive assessment of PD gait. We firstly introduce the Keypoints Vision Transformer, a novel architecture designed to extract vision features of human keypoints. 
This model encompasses both the spatial locations and connectivity relationships of human keypoints. Subsequently, through the proposed temporal fusion encoder, we integrate the extracted skeleton motion with keypoints vision features to enhance the extraction of temporal motion features. In a video dataset of 241 PD participants recorded from the front, our proposed framework achieves an assessment accuracy of 78.05%, which demonstrates superior performance compared to other methods. To enhance the interpretability of our method, we also conduct a feature visualization analysis of the proposed dual-modality framework, which reveal the mechanisms of different body parts and dual-modality branch in PD gait assessment. Additionally, when applied to another video dataset recorded from a more general perspective, our method still achieves a commendable accuracy of 73.07%. This achievement demonstrates the robust generalization capability of the proposed model in PD gait assessment from cross-view, which offers a novel approach for realizing unrestricted PD gait assessment in home monitoring. The latest version of the code is available at <span><span>https://github.com/FJNU-LWP/PD-gait-VSDF</span><svg><path></path></svg></span>.</div></div>\",\"PeriodicalId\":18328,\"journal\":{\"name\":\"Medical image analysis\",\"volume\":\"105 \",\"pages\":\"Article 103727\"},\"PeriodicalIF\":11.8000,\"publicationDate\":\"2025-07-07\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Medical image analysis\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S1361841525002749\",\"RegionNum\":1,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Medical image analysis","FirstCategoryId":"5","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1361841525002749","RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Vision-skeleton dual-modality framework for generalizable assessment of Parkinson’s disease gait
Gait abnormalities in Parkinson’s disease (PD) can reflect the extent of dysfunction, making their assessment crucial for the diagnosis and treatment of PD. Current video-based methods of PD gait assessment focus only on skeleton motion information and are confined to evaluation from a single viewing perspective. To overcome these limitations, we propose a novel vision-skeleton dual-modality framework, which integrates keypoint vision features with skeleton motion information to enable a more accurate and comprehensive assessment of PD gait. We first introduce the Keypoints Vision Transformer, a novel architecture designed to extract vision features of human keypoints that encodes both their spatial locations and their connectivity relationships. Subsequently, the proposed temporal fusion encoder integrates the extracted skeleton motion with the keypoint vision features to enhance the extraction of temporal motion features. On a video dataset of 241 PD participants recorded from the front, the proposed framework achieves an assessment accuracy of 78.05%, outperforming other methods. To enhance the interpretability of our method, we also conduct a feature visualization analysis of the proposed dual-modality framework, which reveals the contributions of different body parts and of the two modality branches to PD gait assessment. Additionally, when applied to another video dataset recorded from a more general perspective, our method still achieves a commendable accuracy of 73.07%. This result demonstrates the robust cross-view generalization capability of the proposed model in PD gait assessment and offers a novel approach toward unrestricted PD gait assessment in home monitoring. The latest version of the code is available at https://github.com/FJNU-LWP/PD-gait-VSDF.
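To make the dual-modality idea in the abstract more concrete, the sketch below shows one possible way to combine per-keypoint vision tokens (appearance plus spatial location, mixed by a transformer that can model connectivity among keypoints) with skeleton-motion features in a temporal encoder. This is a minimal illustrative sketch in PyTorch, not the authors' implementation (see the linked repository for that); all module names, tensor shapes, embedding dimensions, and the number of severity classes are assumptions.

```python
# Minimal sketch of a keypoint-vision + skeleton-motion dual-modality model.
# NOT the authors' implementation; shapes, dimensions, and class count are illustrative.
import torch
import torch.nn as nn


class KeypointVisionEncoder(nn.Module):
    """Embed per-keypoint image patches plus their spatial locations, then let a
    transformer encoder model relationships (connectivity) among keypoints."""
    def __init__(self, patch_dim, num_keypoints, d_model=128, nhead=4, nlayers=2):
        super().__init__()
        self.patch_proj = nn.Linear(patch_dim, d_model)   # appearance of each keypoint patch
        self.coord_proj = nn.Linear(2, d_model)           # spatial location of each keypoint
        self.keypoint_embed = nn.Parameter(torch.zeros(num_keypoints, d_model))
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, nlayers)

    def forward(self, patches, coords):
        # patches: (B*T, K, patch_dim), coords: (B*T, K, 2)
        tokens = self.patch_proj(patches) + self.coord_proj(coords) + self.keypoint_embed
        return self.encoder(tokens).mean(dim=1)           # (B*T, d_model) per-frame feature


class TemporalFusion(nn.Module):
    """Fuse per-frame keypoint vision features with skeleton-motion features over time."""
    def __init__(self, d_model=128, nhead=4, nlayers=2, num_classes=3):
        super().__init__()
        self.skeleton_proj = nn.Linear(2, d_model)        # raw (x, y) coordinates as motion cue
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.temporal = nn.TransformerEncoder(layer, nlayers)
        self.head = nn.Linear(d_model, num_classes)       # num_classes is an assumption

    def forward(self, vision_seq, coords):
        # vision_seq: (B, T, d_model); coords: (B, T, K, 2)
        motion_seq = self.skeleton_proj(coords).mean(dim=2)   # (B, T, d_model)
        fused = self.temporal(vision_seq + motion_seq)        # simple additive fusion
        return self.head(fused.mean(dim=1))                   # gait-severity logits


if __name__ == "__main__":
    B, T, K, P = 2, 16, 17, 8 * 8 * 3        # batch, frames, keypoints, flattened patch size
    patches = torch.randn(B * T, K, P)
    coords = torch.rand(B, T, K, 2)
    vision = KeypointVisionEncoder(P, K)(patches, coords.view(B * T, K, 2)).view(B, T, -1)
    logits = TemporalFusion()(vision, coords)
    print(logits.shape)                       # torch.Size([2, 3])
```

The additive fusion and the mean pooling over keypoints and frames are deliberate simplifications; the actual framework described in the abstract uses its own fusion design within the temporal fusion encoder.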
Journal Introduction:
Medical Image Analysis serves as a platform for sharing new research findings in the realm of medical and biological image analysis, with a focus on applications of computer vision, virtual reality, and robotics to biomedical imaging challenges. The journal prioritizes the publication of high-quality, original papers contributing to the fundamental science of processing, analyzing, and utilizing medical and biological images. It welcomes approaches utilizing biomedical image datasets across all spatial scales, from molecular/cellular imaging to tissue/organ imaging.