Comparison of machine learning algorithms for predicting cognitive impairment using neuropsychological tests.

Impact factor 1.4; Zone 4 (Psychology); JCR Q4 (Clinical Neurology)

Applied Neuropsychology-Adult Pub Date : 2024-09-09 DOI:10.1080/23279095.2024.2392282

Chanda Simfukwe, Seong Soo A An, Young Chul Youn



Abstract

Objectives: Neuropsychological tests (NPTs) are standard tools for assessing cognitive function. They can evaluate a subject's cognitive status, but interpreting them can be time-consuming and expensive. Therefore, this paper aimed to optimize the systematic NPTs with machine learning and to develop new classification models for differentiating healthy controls (HC), mild cognitive impairment (MCI), and Alzheimer's disease dementia (ADD) among groups of subjects.

Patients and methods: A dataset of 14,926 subjects was obtained from the 46 formal NPTs of the Seoul Neuropsychological Screening Battery (SNSB). The subjects had an age of 70.18 ± 7.13 and an education level of 8.18 ± 5.50, and fell into three diagnostic groups: HC, MCI, and ADD. The dataset was preprocessed and used for two- and three-way classification with machine learning algorithms from scikit-learn (www.scikit-learn.org) to differentiate HC versus MCI, HC versus ADD, HC versus cognitive impairment (CI) (MCI + ADD), and HC versus MCI versus ADD. We compared the performance of seven machine learning algorithms: Naïve Bayes (NB), random forest (RF), decision tree (DT), k-nearest neighbors (KNN), support vector machine (SVM), AdaBoost, and linear discriminant analysis (LDA). The accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), area under the curve (AUC), confusion matrices, and receiver operating characteristic (ROC) curves were obtained for each model on the test dataset.
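The comparison described above can be sketched with scikit-learn as follows. This is only an illustration under stated assumptions, not the authors' pipeline: the data are synthetic (generated with `make_classification` as a stand-in for the SNSB scores), and the 80/20 stratified split, the scaling step, and all hyperparameters are assumptions that the abstract does not specify.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the NPT data: 46 features, three imbalanced classes
# (0 = HC, 1 = MCI, 2 = ADD). Sizes and class weights are made up.
X, y = make_classification(
    n_samples=1500, n_features=46, n_informative=20,
    n_classes=3, weights=[0.5, 0.3, 0.2], random_state=0,
)

# Assumed 80/20 split, stratified so each class keeps its proportion.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# The seven algorithm families named in the abstract, with default-like
# settings. KNN and SVM are scaled first because they are distance-based.
models = {
    "NB": GaussianNB(),
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
    "DT": DecisionTreeClassifier(random_state=0),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "LDA": LinearDiscriminantAnalysis(),
}

# Fit each model on the training split and score it on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

for name, acc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:8s} accuracy = {acc:.3f}")
```

The ranking printed here reflects the synthetic data only and says nothing about the SNSB results. A two-way task such as HC versus CI could be approximated in the same sketch by relabelling classes 1 and 2 as a single positive class before splitting.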

Results: The trained models, based on the 29 best-selected NPT features, were evaluated. The RF model yielded the best accuracy, sensitivity, specificity, PPV, NPV, and AUC in all four classification tasks: for HC versus MCI, 98%, 98%, 97%, 98%, 97%, and 99%; for HC versus ADD, 98%, 99%, 96%, 97%, 98%, and 99%; for HC versus CI, 97%, 99%, 92%, 97%, 97%, and 99%; and for HC versus MCI versus ADD, 97%, 96%, 98%, 97%, 98%, and 99%, respectively.
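For reference, the threshold-based metrics reported above (all except AUC) follow directly from the four cells of a binary confusion matrix. The sketch below uses plain Python and a small set of hypothetical labels; the helper names `confusion_counts` and `binary_metrics` are illustrative and not from the paper. How the authors aggregated these metrics for the three-way task is not stated in the abstract, so only the two-way case is shown.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for one positive class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def _ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def binary_metrics(tp, fp, tn, fn):
    """Compute the standard diagnostic metrics from confusion-matrix cells."""
    return {
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": _ratio(tp, tp + fn),  # true positive rate (recall)
        "specificity": _ratio(tn, tn + fp),  # true negative rate
        "ppv": _ratio(tp, tp + fp),          # positive predictive value
        "npv": _ratio(tn, tn + fn),          # negative predictive value
    }


# Hypothetical labels: 1 = cognitively impaired, 0 = healthy control.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

counts = confusion_counts(y_true, y_pred)
metrics = binary_metrics(*counts)
for name, value in metrics.items():
    print(f"{name:12s} {value:.3f}")
```

On an imbalanced dataset, accuracy alone can look high even when the minority class is poorly detected, which is one reason to read sensitivity, specificity, PPV, and NPV together.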

Conclusion: According to the results, RF was the best classification model for both two- and three-way classification among the seven algorithms trained on an imbalanced SNSB NPT dataset. The trained models proved useful for diagnosing MCI and ADD in patients with normal NPTs. These models can optimize cognitive evaluation, enhance diagnostic accuracy, and reduce missed diagnoses.


Source journal: Applied Neuropsychology-Adult (Clinical Neurology; Psychology)

CiteScore: 4.50

Self-citation rate: 11.80%

Articles published: 134

About the journal: Applied Neuropsychology-Adult publishes clinical neuropsychological articles concerning assessment, brain functioning and neuroimaging, neuropsychological treatment, and rehabilitation in adults. Full-length articles and brief communications are included. Case studies of adult patients that carefully assess the nature, course, or treatment of clinical neuropsychological dysfunctions in the context of the scientific literature are suitable. Review manuscripts addressing critical issues are encouraged. Preference is given to papers of clinical relevance to others in the field. All submitted manuscripts are subject to initial appraisal by the Editor-in-Chief and, if found suitable for further consideration, are peer reviewed by independent, anonymous expert referees. All peer review is single-blind, and submission is online via ScholarOne Manuscripts.