Learning Frame-Level Classifiers for Video-Based Real-Time Assessment of Stroke Rehabilitation Exercises From Weakly Annotated Datasets
Ana Rita Cóias; Min Hun Lee; Alexandre Bernardino; Asim Smailagic; Mariana Mateus; David Fernandes; Sofia Trapola
IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 33, pp. 3334-3345. Published 2025-08-25. DOI: 10.1109/TNSRE.2025.3602548
Impact factor: 5.2 (JCR Q2, Engineering, Biomedical)
https://ieeexplore.ieee.org/document/11141498/
Abstract
Autonomous rehabilitation support solutions, such as virtual coaches, should provide real-time feedback to improve motor function and maintain patient engagement. However, collecting fully annotated datasets for real-time exercise assessment is time-consuming and costly, which poses a barrier to evaluating proposed methods. In this work, we present a novel framework that learns a frame-level classifier from weakly annotated videos for real-time assessment of compensatory motions in stroke rehabilitation exercises by generating pseudo-labels at the frame level. We consider three approaches: 1) a baseline approach that uses a source dataset to train a frame-level classifier, 2) a transfer learning approach that uses the target dataset's video-level labels and parameters learned from a source dataset with frame-level labels, and 3) a semi-supervised approach that leverages the target dataset's video-level labels and a small set of frame-level labels. We aim to generalize to a weakly labeled target dataset with new exercises and patients. To validate the approach, we use two datasets annotated for compensatory motions: TULE, an existing dataset with video-level and frame-level labels covering 15 post-stroke patients and three exercises, and SERE, a new dataset created by the authors covering 20 post-stroke patients and five exercises, with video-level labels and a small number of frame-level labels. We show that a frame-level classifier trained on TULE does not generalize well to SERE ($f_1 = 72.87\%$), whereas our semi-supervised and transfer learning approaches achieve $f_1 = 78.93\%$ and $f_1 = 80.47\%$, respectively. Generating pseudo-labels leads to better frame-level classification results on the target dataset than training a classifier on the source dataset alone (baseline). Thus, the proposed approach can simplify the customization of virtual coaches to new patients and exercises with low data annotation effort.
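To make the idea of frame-level pseudo-labels concrete, the following is a minimal Python sketch and not the authors' method, which the abstract does not specify. It assumes a hypothetical per-frame score from some source-trained classifier, a binary video-level label (1 = compensatory motion present), and a hypothetical threshold of 0.5. A video labeled 0 yields all-zero frame labels; a video labeled 1 keeps the frames scoring above the threshold and, if none do, marks the highest-scoring frame so that the video-level label stays satisfied. The F1 helper is the standard binary F1 for the positive class; the averaging used in the paper may differ.

```python
from typing import List, Sequence


def pseudo_labels(frame_scores: Sequence[float], video_label: int,
                  threshold: float = 0.5) -> List[int]:
    """Derive frame-level pseudo-labels from a video-level label.

    frame_scores: hypothetical per-frame scores in [0, 1] from a
                  source-trained classifier (an assumption of this sketch).
    video_label:  1 if the video contains a compensatory motion, else 0.
    threshold:    hypothetical cut-off, not taken from the paper.
    """
    if video_label == 0:
        # A video without compensation has no positive frames.
        return [0] * len(frame_scores)
    labels = [1 if s >= threshold else 0 for s in frame_scores]
    if frame_scores and not any(labels):
        # Keep the labels consistent with the video-level label:
        # mark the single highest-scoring frame as positive.
        top = max(range(len(frame_scores)), key=lambda i: frame_scores[i])
        labels[top] = 1
    return labels


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Binary F1 for the positive class: 2*TP / (2*TP + FP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


if __name__ == "__main__":
    scores = [0.1, 0.7, 0.4]  # made-up scores for illustration only
    print(pseudo_labels(scores, video_label=1))
    print(f1_score([1, 1, 0, 0], [1, 0, 0, 1]))
```

In a pipeline of this kind, the pseudo-labels would then serve as training targets for a frame-level classifier on the target dataset, and the F1 helper would be applied to whatever frames have true frame-level annotations.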
About the journal:
Rehabilitative and neural aspects of biomedical engineering, including functional electrical stimulation, acoustic dynamics, human performance measurement and analysis, nerve stimulation, electromyography, motor control and stimulation; and hardware and software applications for rehabilitation engineering and assistive devices.