Thomas Barba, Bryce A Bagley, Sandra Steyaert, Francisco Carrillo-Perez, Christoph Sadée, Michael Iv, Olivier Gevaert
DUNE: a versatile neuroimaging encoder captures brain complexity across 3 major diseases: cancer, dementia, and schizophrenia
GigaScience, volume 14. Published 2025-01-06. DOI: 10.1093/gigascience/giaf116 (https://doi.org/10.1093/gigascience/giaf116). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12527335/pdf/
Abstract
Background: Magnetic resonance imaging (MRI) of the brain contains complex data that pose significant challenges for computational analysis. While models proposed for brain MRI analyses yield encouraging results, the high complexity of neuroimaging data hinders generalizability and clinical application. We introduce DUNE, a neuroimaging-oriented workflow that transforms raw brain MRI scans into standardized compact patient-level embeddings through integrated preprocessing and deep feature extraction, thereby enabling their processing by basic machine learning algorithms. A UNet-based autoencoder was trained using 3,814 selected scans of morphologically normal (healthy volunteers) or abnormal (glioma patients) brains, to generate comprehensive compact representations of the full-sized images. To evaluate their quality, these embeddings were utilized to train machine learning models to predict a wide range of clinical variables.
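The idea that compact embeddings can be handled by basic machine learning algorithms can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the embeddings are synthetic Gaussian vectors rather than DUNE outputs, the dimension (8) is arbitrary, and the nearest-centroid classifier is a stand-in, since the abstract does not say which algorithms were used.

```python
import random


def make_synthetic_embeddings(n, dim, shift, rng):
    """Return (vectors, labels); class 1 is shifted by `shift` in every dimension.

    Hypothetical stand-in for patient-level embeddings, not real DUNE output.
    """
    X, y = [], []
    for i in range(n):
        label = i % 2  # alternate classes so both are equally represented
        X.append([rng.gauss(shift * label, 1.0) for _ in range(dim)])
        y.append(label)
    return X, y


def fit_centroids(X, y):
    """Compute the mean embedding of each class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Assign x to the class whose centroid is closest (squared Euclidean distance)."""
    def sqdist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(centroids, key=lambda label: sqdist(centroids[label], x))


rng = random.Random(0)  # fixed seed so the sketch is reproducible
X_train, y_train = make_synthetic_embeddings(200, 8, 1.5, rng)
X_test, y_test = make_synthetic_embeddings(100, 8, 1.5, rng)

centroids = fit_centroids(X_train, y_train)
accuracy = sum(predict(centroids, x) == t for x, t in zip(X_test, y_test)) / len(y_test)
print(f"held-out accuracy on synthetic embeddings: {accuracy:.2f}")
```

The point of the sketch is only that, once each scan is reduced to a fixed-length vector, a classifier with no image-specific machinery can be trained and evaluated on it.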
Results: Embeddings were extracted for the cohorts used for model development (21,102 individuals), along with 3 additional independent cohorts (Alzheimer's disease, schizophrenia, and glioma cohorts, 1,322 individuals), to evaluate the model's generalization capabilities. The embeddings extracted from healthy volunteers' scans could predict a broad spectrum of clinical parameters, including volumetry metrics, cardiovascular disease (area under the receiver operating characteristic curve [AUROC] = 0.80) and alcohol consumption (AUROC = 0.99), and more nuanced parameters such as the Alzheimer's-predisposing APOE4 allele (AUROC = 0.67). Embeddings derived from the validation cohorts successfully predicted the diagnoses of Alzheimer's dementia (AUROC = 0.92) and schizophrenia (AUROC = 0.64). Embeddings extracted from glioma scans successfully predicted survival (C-index = 0.608) and IDH molecular status (AUROC = 0.92), matching the performance of previous task-oriented models.
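For readers less familiar with the two metrics reported above, the following sketch shows how AUROC and the concordance index (Harrell's C-index) are defined on toy data. The function names and the toy values are illustrative assumptions, not taken from the paper, and the authors' actual evaluation code may differ.

```python
def auroc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored survival data.

    A pair (i, j) is comparable when subject i had the event (events[i] == 1)
    and a strictly shorter follow-up time than subject j. The pair is
    concordant when the subject with the shorter survival has the higher
    predicted risk; tied risks count 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / comparable


# Toy binary classification: 3 of the 4 positive/negative pairs are ranked correctly.
auc = auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(auc)  # 0.75

# Toy survival data: the third subject is censored (event = 0).
c_index = concordance_index(
    times=[2, 4, 6, 8],
    events=[1, 1, 0, 1],
    risks=[0.9, 0.4, 0.5, 0.1],
)
print(c_index)  # 0.8
```

Both metrics equal 0.5 for random ranking and 1.0 for perfect ranking, which gives context for values such as an AUROC of 0.64 or a C-index of 0.608.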
Conclusion: DUNE efficiently represents clinically relevant patterns from full-size brain MRI scans across several disease areas, opening the way to innovative clinical applications in neurology.
About the journal:
GigaScience seeks to transform data dissemination and utilization in the life and biomedical sciences. As an online open-access open-data journal, it specializes in publishing "big-data" studies encompassing various fields. Its scope includes not only "omic" type data and the fields of high-throughput biology currently serviced by large public repositories, but also the growing range of more difficult-to-access data, such as imaging, neuroscience, ecology, cohort data, systems biology and other new types of large-scale shareable data.