Deep convolutional neural networks outperform vanilla machine learning when predicting language outcomes after stroke
Thomas M.H. Hope, Howard Bowman, Alex P. Leff, Cathy J. Price
NeuroImage: Clinical, volume 48, Article 103880. Published 2025-01-01. DOI: 10.1016/j.nicl.2025.103880
Impact factor: 3.6. JCR: Q2 (Neuroimaging).
https://www.sciencedirect.com/science/article/pii/S2213158225001536
Abstract
Background
Current medicine cannot confidently predict patients’ language skills after stroke. In recent years, researchers have sought to bridge this gap with machine learning. These models appear to benefit from access to features describing where, and how much, brain damage these patients have suffered. Given the very high dimensionality of structural brain imaging data, those brain lesion features are typically derived by post-processing the images into tabular features. With the introduction of deep Convolutional Neural Networks (CNNs), which appear to be much more robust to high-dimensional data, it is natural to hope that much of this image post-processing might be unnecessary. But prior attempts to demonstrate this in post-stroke prognostics have so far yielded only equivocal results, perhaps because the datasets available to those studies were too small to properly constrain CNNs, which are famously ‘data-hungry’.
Methods
This study draws on a much larger dataset than previous work of this kind. The dataset comprises patients whose language outcomes were assessed once during the chronic phase post-stroke, on or around the same days as they underwent high-resolution MRI brain scans. Following the model of our own and others’ past work, we use state-of-the-art ‘vanilla’ machine learning models (boosted ensembles) to predict a variety of language and cognitive outcome scores. These models employ both demographic variables and features derived from the brain imaging data, which represent where brain damage has occurred. These are our baseline models. Next, we use deep CNNs to predict the same language scores for the same patients, drawing on both the demographic variables and post-processed brain lesion images: i.e., multi-input models with one input for tabular features and another for 3-dimensional images. We compare the models using 5 × 2-fold cross-validation, with consistent folds.
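The comparison scheme described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes that "consistent folds" means both model families are scored on exactly the same train/test splits, and it uses two toy regressors (a mean predictor and a one-feature least-squares line) as hypothetical stand-ins for the boosted ensembles and the CNNs. All function names, seeds, and data here are invented for illustration.

```python
import random
from statistics import mean


def five_by_two_splits(n_samples, n_repeats=5, seed=0):
    """Return 2 * n_repeats (train, test) index pairs.

    Each repeat shuffles the samples, cuts them in half, and uses the
    halves in both directions (2-fold). Seeding each repeat makes the
    splits reproducible, so every model can be scored on the same folds.
    """
    splits = []
    for repeat in range(n_repeats):
        rng = random.Random(seed + repeat)
        order = list(range(n_samples))
        rng.shuffle(order)
        half = n_samples // 2
        first, second = order[:half], order[half:]
        splits.append((first, second))  # train on first half, test on second
        splits.append((second, first))  # then swap the roles
    return splits


def evaluate(fit_predict, X, y, splits):
    """Mean squared error of one model on each split."""
    errors = []
    for train, test in splits:
        preds = fit_predict([X[i] for i in train],
                            [y[i] for i in train],
                            [X[i] for i in test])
        errors.append(mean((p - y[i]) ** 2 for p, i in zip(preds, test)))
    return errors


def mean_model(X_train, y_train, X_test):
    """Toy baseline (hypothetical stand-in): always predict the training mean."""
    m = mean(y_train)
    return [m for _ in X_test]


def linear_model(X_train, y_train, X_test):
    """Toy competitor (hypothetical stand-in): one-feature least squares."""
    mx, my = mean(X_train), mean(y_train)
    var = sum((x - mx) ** 2 for x in X_train)
    slope = sum((x - mx) * (t - my) for x, t in zip(X_train, y_train)) / var
    return [my + slope * (x - mx) for x in X_test]


# Synthetic data, invented purely so the sketch runs end to end.
data_rng = random.Random(42)
X = [data_rng.uniform(0, 10) for _ in range(40)]
y = [2.0 * x + data_rng.gauss(0, 1) for x in X]

splits = five_by_two_splits(len(X))
errors_a = evaluate(mean_model, X, y, splits)    # same splits for both models
errors_b = evaluate(linear_model, X, y, splits)
diffs = [a - b for a, b in zip(errors_a, errors_b)]  # paired per-split differences
```

Because both models see identical splits, the per-split differences in `diffs` are paired, which is what makes a 5 × 2-fold comparison between two models meaningful.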
Results
In this domain, the CNN models consistently outperform the vanilla machine learning models.
Conclusions
Deep CNNs offer state-of-the-art performance when predicting language outcomes after stroke, outperforming vanilla machine learning and obviating the need to post-process lesion images into lesion features.
About the journal:
NeuroImage: Clinical, a journal of diseases, disorders and syndromes involving the Nervous System, provides a vehicle for communicating important advances in the study of abnormal structure-function relationships of the human nervous system based on imaging.
The focus of NeuroImage: Clinical is on defining changes to the brain associated with primary neurologic and psychiatric diseases and disorders of the nervous system as well as behavioral syndromes and developmental conditions. The main criterion for judging papers is the extent of scientific advancement in the understanding of the pathophysiologic mechanisms of diseases and disorders, in identification of functional models that link clinical signs and symptoms with brain function and in the creation of image based tools applicable to a broad range of clinical needs including diagnosis, monitoring and tracking of illness, predicting therapeutic response and development of new treatments. Papers dealing with structure and function in animal models will also be considered if they reveal mechanisms that can be readily translated to human conditions.