Michael B Bone, Morris Freedman, Sandra E Black, Daniel Felsky, Sanjeev Kumar, Bradley Pugh, Stephen C Strother, David F Tang-Wai, Maria Carmela Tartaglia, Bradley R Buchsbaum
DOI: 10.1002/dad2.70171
Journal: Alzheimer's & Dementia: Diagnosis, Assessment and Disease Monitoring, 17(3), e70171
Published: 2025-09-23 (eCollection; Q1, Clinical Neurology)
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12457074/pdf/
A vision transformer approach for fully automated and scalable dementia screening using clock drawing test images.
Introduction: The clock drawing test (CDT) screens for dementia but requires trained scorers and lacks standardized criteria. Thus, we developed an automated vision transformer (ViT)-based diagnostic system with convolutional neural network preprocessing for analyzing hand-drawn CDT images.
Methods: The architecture implements fine-tuned ViT feature extraction with linear classification for dementia prediction. Training used the National Health and Aging Trends Study (NHATS) dataset (n = 54,027), with testing on an independent clinical cohort from the Toronto Dementia Research Alliance (TDRA; n = 862; 522 dementia, 340 normal cognition).
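The final stage of the pipeline, a linear classifier applied to ViT features, is equivalent to logistic regression over the extracted embedding. Below is a minimal illustrative sketch, not the paper's implementation: the 768-dimensional embedding width is an assumption (the standard ViT-Base width), and `linear_head` is a hypothetical name.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def linear_head(features, weights, bias):
    """Dementia probability from a pooled ViT feature vector via a
    single linear layer plus sigmoid (binary logistic regression)."""
    z = sum(f * w for f, w in zip(features, weights)) + bias
    return sigmoid(z)

# Toy demo: with untrained (zero) weights the head is maximally
# uncertain, so the predicted probability is exactly 0.5.
x = [0.1] * 768              # stand-in for a 768-dim ViT-Base embedding
p = linear_head(x, [0.0] * 768, 0.0)  # 0.5
```

In practice the weights would be fit on NHATS training features and the probability thresholded to produce a dementia/normal-cognition label.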
Results: The ViT approach predicted dementia with 76.5% balanced accuracy, outperforming human-scored features (74.3%) and existing deep learning models (MiniVGG = 73.3%, MobileNetV2 = 72.3%, relevance factor variational autoencoder = 69.1%) on the TDRA dataset.
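Balanced accuracy, the metric reported above, is the mean of per-class recall, which keeps chance level at 50% even on an imbalanced test set such as TDRA's 522 dementia vs. 340 normal-cognition cases. A short sketch of the computation (toy labels, not study data):

```python
def balanced_accuracy(y_true, y_pred):
    """Mean per-class recall: each class contributes equally
    regardless of how many samples it has."""
    recalls = []
    for cls in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == cls]
        hits = sum(1 for i in idx if y_pred[i] == cls)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)

# Toy check: 60% recall on class 0, 80% on class 1 -> 0.70 balanced accuracy
y_true = [0] * 5 + [1] * 5
y_pred = [0, 0, 0, 1, 1, 1, 1, 1, 1, 0]
score = balanced_accuracy(y_true, y_pred)  # 0.7
```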
Discussion: This pen-and-paper compatible diagnostic system enables scalable remote cognitive screening through automated CDT image analysis that is competitive with human-scored features, potentially increasing diagnostic accessibility for diverse populations across varied socioeconomic contexts.
Highlights:
- The vision transformer model achieves 76.5% accuracy in dementia detection from clock drawing tests, outperforming human scoring and existing deep learning methods.
- Novel convolutional neural network-based preprocessing automatically handles challenging image-quality issues such as shadows, irrelevant markings, and improper cropping.
- The system requires only a photo of a hand-drawn clock test, enabling scalable remote screening accessible across socioeconomic contexts.
- A feature-extraction model trained on 54,027 samples demonstrates robust generalization to an independent clinical dataset of 862 patients.
- This fully automated approach eliminates the need for trained scorers while maintaining diagnostic accuracy above manual methods.
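The preprocessing the highlights describe is a learned CNN in the paper; a much simpler stand-in conveys the idea of automatic cropping, namely trimming a photographed page to the bounding box of the drawn ink. Everything here (the function name, the threshold value, the list-of-rows image format) is a hypothetical illustration, not the authors' method.

```python
def crop_to_ink(image, threshold=128):
    """Crop a grayscale image (list of rows; 0 = black ink, 255 = paper)
    to the tight bounding box of pixels darker than `threshold`.
    A simplified stand-in for the paper's learned CNN preprocessing."""
    rows = [r for r, row in enumerate(image) if any(p < threshold for p in row)]
    cols = [c for c in range(len(image[0]))
            if any(row[c] < threshold for row in image)]
    if not rows:
        return image  # blank page: nothing to crop
    return [row[cols[0]:cols[-1] + 1] for row in image[rows[0]:rows[-1] + 1]]

# 4x5 "page" with ink only in a central 2x2 block
page = [
    [255, 255, 255, 255, 255],
    [255,   0,  30, 255, 255],
    [255,  40,   0, 255, 255],
    [255, 255, 255, 255, 255],
]
cropped = crop_to_ink(page)  # [[0, 30], [40, 0]]
```

The paper's CNN additionally removes shadows and irrelevant markings, which a fixed threshold like this cannot do robustly.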
About the journal:
Alzheimer's & Dementia: Diagnosis, Assessment & Disease Monitoring (DADM) is an open access, peer-reviewed journal from the Alzheimer's Association® that publishes new research reporting the discovery, development, and validation of instruments, technologies, algorithms, and innovative processes. Papers cover a range of topics related to the early and accurate detection of individuals with memory complaints and/or asymptomatic individuals at elevated risk for various forms of memory disorders. Published papers are expected to translate fundamental knowledge about the neurobiology of the disease into practical reports that describe both the conceptual and methodological aspects of the submitted scientific inquiry. Published topics will explore the development of biomarkers, surrogate markers, and conceptual/methodological challenges. Publication priority will be given to papers that describe 1) putative surrogate markers that accurately track disease progression, 2) biomarkers that fulfill international regulatory requirements, 3) reports from large, well-characterized population-based cohorts that capture the heterogeneity and diversity of asymptomatic individuals, and 4) algorithmic development that considers multi-marker arrays (e.g., integrated-omics, genetics, biofluids, imaging, etc.) and advanced computational analytics and technologies.