Predicting recovery following stroke: Deep learning, multimodal data and feature selection using explainable AI

IF 3.6, Zone 2 (Medicine), JCR Q2 (NEUROIMAGING)

Neuroimage-Clinical, Pub Date: 2024-01-01, DOI: 10.1016/j.nicl.2024.103638

Adam White, Margarita Saranti, Artur d’Avila Garcez, Thomas M.H. Hope, Cathy J. Price, Howard Bowman

Neuroimage-Clinical, volume 43, Article 103638. Full text: https://www.sciencedirect.com/science/article/pii/S2213158224000779

Citations: 0

Abstract

Machine learning offers great potential for automated prediction of post-stroke symptoms and their response to rehabilitation. Major challenges for this endeavour include the very high dimensionality of neuroimaging data, the relatively small size of the datasets available for learning and interpreting the predictive features, and the question of how to effectively combine neuroimaging and tabular data (e.g. demographic information and clinical characteristics). This paper evaluates several solutions based on two strategies. The first is to use 2D images that summarise MRI scans. The second is to select key features that improve classification accuracy. Additionally, we introduce the novel approach of training a convolutional neural network (CNN) on images that combine regions of interest (ROIs) extracted from MRIs with symbolic representations of tabular data.
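As a rough illustration of the idea of placing ROIs and tabular data in a single 2D image, the sketch below tiles ROI patches into a grid and appends one extra tile per tabular variable. The abstract does not say how the symbolic representations are drawn, so the bar encoding used here (`bar_tile`), the tile size, and the grid layout are all assumptions made for this example, not the authors' method.

```python
from typing import List, Sequence

Image = List[List[float]]  # row-major grid of pixel intensities in [0, 1]


def blank(height: int, width: int) -> Image:
    """Create an all-zero canvas."""
    return [[0.0] * width for _ in range(height)]


def paste(canvas: Image, tile: Image, top: int, left: int) -> None:
    """Copy a tile into the canvas with its top-left corner at (top, left)."""
    for r, row in enumerate(tile):
        for c, value in enumerate(row):
            canvas[top + r][left + c] = value


def bar_tile(value: float, size: int) -> Image:
    """Hypothetical encoding: a value in [0, 1] becomes a filled horizontal bar."""
    value = min(1.0, max(0.0, value))
    filled = round(value * size)
    return [[1.0 if c < filled else 0.0 for c in range(size)] for _ in range(size)]


def compose(rois: Sequence[Image], tabular: Sequence[float], size: int, cols: int) -> Image:
    """Tile ROI patches first, then one bar tile per tabular value."""
    tiles = list(rois) + [bar_tile(v, size) for v in tabular]
    rows = -(-len(tiles) // cols)  # ceiling division
    canvas = blank(rows * size, cols * size)
    for i, tile in enumerate(tiles):
        paste(canvas, tile, (i // cols) * size, (i % cols) * size)
    return canvas


# Toy input: eight flat grey 4x4 "ROIs" and three normalised tabular values
# standing in for lesion size, initial severity and recovery time.
toy_rois = [[[0.5] * 4 for _ in range(4)] for _ in range(8)]
image = compose(toy_rois, tabular=[0.25, 0.5, 1.0], size=4, cols=4)
```

The resulting `image` is a single 2D array that an ordinary 2D CNN could take as input, which is the point of the combined representation; any real pipeline would need a principled choice of encoding and normalisation.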

We evaluate a series of CNN architectures (both 2D and 3D) that are trained on different representations of MRI and tabular data to predict whether a composite measure of post-stroke spoken picture description ability is in the aphasic or non-aphasic range. MRI and tabular data were acquired from 758 English-speaking stroke survivors who participated in the PLORAS study. Each participant was assigned to one of five groups that were matched for initial severity of symptoms, recovery time, left lesion size and the months or years post-stroke at which spoken description scores were collected. Training and validation were carried out on the first four groups. The fifth (lock-box/test set) group was used to test how well model accuracy generalises to new (unseen) data.
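The five-group design with a held-out lock-box can be sketched as follows. The abstract does not describe the matching procedure, so this example swaps in a simple stand-in: sort participants on the matching variables and deal them round-robin into groups. The function `assign_groups`, the field names and the toy cohort are hypothetical.

```python
from typing import Dict, List, Sequence

Record = Dict[str, float]


def assign_groups(records: Sequence[Record], keys: Sequence[str],
                  n_groups: int = 5) -> List[List[Record]]:
    """Sort on the matching variables, then deal records round-robin into groups.

    Neighbouring records in the sorted order land in different groups, so each
    group receives a similar spread of the matching variables.
    """
    ordered = sorted(records, key=lambda r: tuple(r[k] for k in keys))
    groups: List[List[Record]] = [[] for _ in range(n_groups)]
    for i, record in enumerate(ordered):
        groups[i % n_groups].append(record)
    return groups


# Toy cohort: field names and values are made up for illustration.
cohort = [
    {"id": float(i), "severity": float(i % 10), "lesion_size": float(i)}
    for i in range(20)
]
groups = assign_groups(cohort, keys=("severity", "lesion_size"))

# Groups 1-4 are used for training and validation; group 5 is the lock-box
# and is only touched once, for the final generalisation test.
development = [record for group in groups[:4] for record in group]
lockbox = groups[4]
```

Keeping the lock-box out of every model-selection decision is what lets its accuracy be read as an estimate of performance on unseen data.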

The classification accuracy for a baseline logistic regression was 0.678 based on lesion size alone, rising to 0.757 and 0.813 when initial symptom severity and recovery time were successively added. The highest classification accuracy (0.854), area under the curve (0.899) and F1 score (0.901) were observed when 8 regions of interest were extracted from each MRI scan and combined with lesion size, initial severity and recovery time in a 2D Residual Neural Network (ResNet). This was also the best model when data were limited to the 286 participants with moderate or severe initial aphasia (with area under the curve = 0.865), a group that would be considered more difficult to classify.
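For readers less familiar with the three metrics reported above, the following minimal sketch computes them from scratch for a binary classifier. The labels and scores are invented toy values, not data from the study, and the 0.5 decision threshold is an assumption.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive scores above a random negative.

    Ties count as half a win (the Mann-Whitney formulation of the AUC).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels and classifier scores (illustrative only).
y_true = [1, 1, 1, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.6, 0.2, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

acc = accuracy(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

Accuracy and F1 depend on the chosen threshold, whereas the area under the curve summarises ranking quality across all thresholds, which is why the three numbers can differ noticeably for the same model.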

Our findings demonstrate how imaging and tabular data can be combined to achieve high post-stroke classification accuracy, even when the dataset is small in machine learning terms. We conclude by proposing how the current models could be improved to achieve even higher levels of accuracy using images from hospital scanners.


Source journal

Neuroimage-Clinical (NEUROIMAGING)

CiteScore: 7.50

Self-citation rate: 4.80%

Articles published: 368

Review time: 52 days

About the journal: NeuroImage: Clinical, a journal of diseases, disorders and syndromes involving the Nervous System, provides a vehicle for communicating important advances in the study of abnormal structure-function relationships of the human nervous system based on imaging. The focus of NeuroImage: Clinical is on defining changes to the brain associated with primary neurologic and psychiatric diseases and disorders of the nervous system as well as behavioral syndromes and developmental conditions. The main criterion for judging papers is the extent of scientific advancement in the understanding of the pathophysiologic mechanisms of diseases and disorders, in identification of functional models that link clinical signs and symptoms with brain function and in the creation of image based tools applicable to a broad range of clinical needs including diagnosis, monitoring and tracking of illness, predicting therapeutic response and development of new treatments. Papers dealing with structure and function in animal models will also be considered if they reveal mechanisms that can be readily translated to human conditions.