Automatic motor and visuospatial cognition screening with ensemble learning: A computerised clock drawing test approach
Andrius Lauraitis, Armantas Ostreika, Gintaras Palubeckis, Liudas Motiejunas
Computers in Biology and Medicine, vol. 197, Article 111107. Published 2025-09-20.
DOI: 10.1016/j.compbiomed.2025.111107
URL: https://www.sciencedirect.com/science/article/pii/S0010482525014593
Abstract
We propose a supervised ensemble learning-based approach to evaluate the significance of the digitised analogue clock drawing test (CDT) for detecting neural impairments in patients with early-stage central nervous system disorders (CNSD). The findings are based on data samples collected with the clock construction task of the Neural Impairment Test Suite (NITS) mobile application from 15 test subjects (including Huntington Disease (HD), Parkinson Disease (PD), cerebral palsy (CP), post-stroke, early dementia and control groups) during a pilot study in Lithuania. This work examines finger motion tracking (FMT) on a mobile device and the detection of a potential inability of CNSD patients to accurately copy benchmark clock drawings without a pre-drawn clock contour circle, focusing on multimodal neural impairment screening (datasets of FMT samples and CDT images). Because the originally gathered datasets are small and imbalanced, two pre-processing routines were applied: the Synthetic Minority Oversampling Technique (SMOTE) was used to augment the FMT data, and geometric image transformations (rotation, flip, zoom) were used to augment the CDT drawings.
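The SMOTE step mentioned above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `smote`, its parameters, and the toy data are assumptions made for illustration, and only the core idea is shown (a synthetic sample is placed at a random point on the segment between a minority sample and one of its nearest minority neighbours).

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic samples from a list of minority-class vectors.

    Hypothetical helper: each synthetic sample is a random interpolation
    between a randomly chosen minority sample and one of its k nearest
    minority-class neighbours (Euclidean distance).
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by distance to the base sample.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority class with three 2-D feature vectors (illustrative values only).
minority_samples = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
new_samples = smote(minority_samples, n_new=5, k=2)
```

Because every synthetic point lies between two real minority samples, the augmented class stays inside the region already covered by the original data.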
The following features were extracted from the FMT and CDT image datasets, respectively: 1) average finger speed while moving on the surface, finger velocity, magnitude of the rate at which the finger tap changes its position, standard deviation (SD) of velocity, rate at which finger velocity changes, maximum finger acceleration, finger position change count, average finger screen pressure and touch area ratio (in range [0; 1]), total time duration (in seconds); 2) Edge Histogram Filter (EHD), Pyramid Histogram of Oriented Gradients (PHOG), Gabor wavelet and their fusion.
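Several of the FMT features listed above can be derived from timestamped touch samples. The sketch below is an assumption-laden illustration, not the NITS application's code: the `(t, x, y, pressure)` sample layout, the function name `fmt_features`, and the finite-difference definitions of speed and acceleration are all hypothetical choices.

```python
import math
import statistics


def fmt_features(samples):
    """Compute simple kinematic features from touch samples.

    samples: list of (t_seconds, x, y, pressure) tuples ordered by time
    (an assumed layout; pressure is taken to lie in [0, 1]).
    """
    if len(samples) < 3:
        raise ValueError("need at least three samples")
    speeds = []
    position_changes = 0
    for (t0, x0, y0, _), (t1, x1, y1, _) in zip(samples, samples[1:]):
        dt = t1 - t0
        if dt <= 0:
            raise ValueError("timestamps must be strictly increasing")
        distance = math.hypot(x1 - x0, y1 - y0)
        if distance > 0:
            position_changes += 1
        speeds.append(distance / dt)
    # Acceleration: change in speed between consecutive segments,
    # divided by the duration of the later segment.
    accelerations = [
        (speeds[i] - speeds[i - 1]) / (samples[i + 1][0] - samples[i][0])
        for i in range(1, len(speeds))
    ]
    return {
        "mean_speed": statistics.fmean(speeds),
        "sd_speed": statistics.pstdev(speeds),
        "max_abs_acceleration": max(abs(a) for a in accelerations),
        "position_changes": position_changes,
        "mean_pressure": statistics.fmean(s[3] for s in samples),
        "duration_s": samples[-1][0] - samples[0][0],
    }


# Toy trace: move, pause, move (illustrative values only).
trace = [(0.0, 0, 0, 0.5), (1.0, 3, 4, 0.5), (2.0, 3, 4, 0.7), (3.0, 6, 8, 0.9)]
features = fmt_features(trace)
```

In this toy trace the finger covers 5 units in the first and third seconds and rests in between, so the pause shows up both as a zero-speed segment and as an uncounted position change.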
Two experiments (E1, E2) were conducted to solve the healthy vs. impaired binary classification problem. E1 is designed to track motor impairments in CNSD, whereas E2 targets the detection of cognitive impairments. All classifiers (K-NN, Naïve Bayes, ANN, SMO, SVM and their ensembles) were tested with a 5-fold stratified cross-validation procedure, and the classification models were evaluated by accuracy, balanced accuracy (BA), F1 score, sensitivity, specificity, kappa, area under the receiver operating characteristic curve (AUC-ROC), mean absolute error (MAE) and root mean squared error (RMSE). Principal Component Analysis (PCA) was used to reduce the dimensionality of the high-dimensional image feature vectors. Overfitting was addressed by comparing the learning curves of the training and validation sets. Results: 1) in E1, the highest accuracy of 99.20 % (boosted SMO algorithm with PuK kernel) was achieved on the SMOTE-synthesised FMT training set, with 99.40 % accuracy on the FMT test set; 2) in E2 (augmented dataset of CDT images), the highest accuracy of 97.96 % (94.90 % on the test set) was achieved with an ensemble of features (EHD, PHOG, Gabor) and a majority-vote classifier ensemble of KNN + AdaBoost (Naïve Bayes) + AdaBoost (SVM).
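The majority-vote ensemble and several of the reported metrics can be sketched from first principles. The code below is a hedged illustration rather than the study's pipeline: the function names, the label encoding (1 = impaired, 0 = healthy), and the toy predictions are assumptions, and the metric formulas are the standard confusion-matrix definitions.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists (all the same length) by majority vote.

    With an odd number of binary classifiers, ties cannot occur.
    """
    return [Counter(labels).most_common(1)[0][0] for labels in zip(*predictions)]


def binary_metrics(y_true, y_pred, positive=1):
    """Standard binary metrics; assumes both classes occur in y_true
    and that chance agreement is below 1 (otherwise kappa is undefined)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    n = len(pairs)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / n
    # Agreement expected by chance, used for Cohen's kappa.
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "f1": 2 * tp / (2 * tp + fp + fn),
        "kappa": (accuracy - p_chance) / (1 - p_chance),
    }


# Toy labels and three hypothetical base classifiers (illustrative only).
y_true = [1, 1, 1, 0, 0, 0]
clf_a = [1, 1, 0, 0, 0, 1]
clf_b = [1, 0, 1, 0, 1, 0]
clf_c = [1, 1, 1, 0, 0, 0]

voted = majority_vote([clf_a, clf_b, clf_c])
single_scores = binary_metrics(y_true, clf_a)
ensemble_scores = binary_metrics(y_true, voted)
```

In this toy case the individual errors of the first two classifiers fall on different samples, so the vote corrects them; this is the general motivation for majority voting, not a claim about the paper's results.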
About the journal:
Computers in Biology and Medicine is an international forum for sharing groundbreaking advancements in the use of computers in bioscience and medicine. This journal serves as a medium for communicating essential research, instruction, ideas, and information regarding the rapidly evolving field of computer applications in these domains. By encouraging the exchange of knowledge, we aim to facilitate progress and innovation in the utilization of computers in biology and medicine.