Deep learning-based Alzheimer's disease detection: reproducibility and the effect of modeling choices.


Frontiers in Computational Neuroscience, vol. 18, article 1360095. Published: 2024-09-20 (eCollection 2024). DOI: 10.3389/fncom.2024.1360095

Rosanna Turrisi, Alessandro Verri, Annalisa Barla



Abstract

Introduction: Machine Learning (ML) has emerged as a promising approach in healthcare, outperforming traditional statistical techniques. However, to establish ML as a reliable tool in clinical practice, adherence to best practices in data handling, model design, and model assessment is crucial. In this work, we summarize and strictly adhere to such practices to ensure reproducible and reliable ML. Specifically, we focus on Alzheimer's Disease (AD) detection, a challenging problem in healthcare. Additionally, we investigate the impact of modeling choices, including different data augmentation techniques and model complexity, on overall performance.

Methods: We use Magnetic Resonance Imaging (MRI) data from the ADNI corpus to address a binary classification problem using 3D Convolutional Neural Networks (CNNs). Data processing and modeling are specifically tailored to address data scarcity and minimize computational overhead. Within this framework, we train 15 predictive models, combining three different data augmentation strategies with five distinct 3D CNN architectures that differ in the number of convolutional layers. The augmentation strategies involve affine transformations, such as zoom, shift, and rotation, applied either concurrently or separately.
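The experimental design described in the Methods can be sketched as a small grid. This is only an illustration under stated assumptions: the abstract names neither the three strategies nor the five layer counts, so the labels `strategy_1` to `strategy_3`, the depth values, and the helper names `experiment_grid` and `sample_transforms` are hypothetical placeholders, not the authors' implementation.

```python
import itertools
import random

# Hypothetical labels: the abstract does not name the three strategies.
AUGMENTATION_STRATEGIES = ["strategy_1", "strategy_2", "strategy_3"]
# Placeholder depths: the abstract does not give the actual layer counts.
CONV_LAYER_COUNTS = [2, 4, 6, 8, 10]

# Affine transformations mentioned in the abstract.
TRANSFORMS = ("zoom", "shift", "rotation")


def experiment_grid():
    """Return every (strategy, depth) pair: 3 strategies x 5 depths = 15 models."""
    return list(itertools.product(AUGMENTATION_STRATEGIES, CONV_LAYER_COUNTS))


def sample_transforms(mode, rng):
    """Choose which affine transformations to apply to one training volume.

    mode="concurrent": all transformations are composed on the same volume.
    mode="separate":   exactly one transformation is applied per augmented volume.
    """
    if mode == "concurrent":
        return list(TRANSFORMS)
    if mode == "separate":
        return [rng.choice(TRANSFORMS)]
    raise ValueError(f"unknown mode: {mode!r}")


grid = experiment_grid()
print(len(grid))  # 15
```

Fixing the random seed passed to `sample_transforms` (for example `random.Random(0)`) keeps the augmentation sequence repeatable, which is in the spirit of the reproducibility practices the paper argues for.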

Results: The combined effect of data augmentation and model complexity results in up to 10% variation in prediction accuracy. Notably, when affine transformations are applied separately, the model achieves higher accuracy, regardless of the chosen architecture. Across all strategies, the model accuracy exhibits a concave behavior as the number of convolutional layers increases, peaking at an intermediate value. The best model reaches excellent performance on both the internal testing set and an additional external testing set.
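One way to check the "concave, peaking at an intermediate value" pattern on a depth-versus-accuracy curve is sketched below. The function names are hypothetical, and the accuracy values are toy numbers invented purely for illustration; they are not results from the paper.

```python
def is_concave(acc_by_depth):
    """True if the slopes between consecutive depths never increase."""
    depths = sorted(acc_by_depth)
    slopes = [
        (acc_by_depth[b] - acc_by_depth[a]) / (b - a)
        for a, b in zip(depths, depths[1:])
    ]
    return all(later <= earlier for earlier, later in zip(slopes, slopes[1:]))


def peak_depth(acc_by_depth):
    """Return the depth with the highest accuracy."""
    return max(acc_by_depth, key=acc_by_depth.get)


# Toy values for illustration only (NOT taken from the paper).
toy_accuracy = {2: 0.80, 4: 0.86, 6: 0.88, 8: 0.85, 10: 0.78}

print(is_concave(toy_accuracy))  # True
print(peak_depth(toy_accuracy))  # 6
```

With real measurements, run-to-run noise can break strict concavity, so a check like this is better treated as a quick diagnostic than as a statistical test.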

Discussion: Our work underscores the critical importance of adhering to rigorous experimental practices in the field of ML applied to healthcare. The results clearly demonstrate how data augmentation and model depth, two often overlooked factors, can dramatically impact final performance if not thoroughly investigated. This highlights both the necessity of exploring neglected modeling aspects and the need to comprehensively report all modeling choices to ensure reproducibility and facilitate meaningful comparisons across studies.

