A Framework for Comparison and Interpretation of Machine Learning Classifiers to Predict Autism on the ABIDE Dataset
Yilan Dong, Dafnis Batalle, Maria Deprez
Human Brain Mapping, volume 46, issue 5. Published 2025-03-17. DOI: 10.1002/hbm.70190 (https://onlinelibrary.wiley.com/doi/10.1002/hbm.70190)
Autism is a neurodevelopmental condition affecting ~1% of the population. Recently, machine learning models have been trained to classify participants with autism from their neuroimaging features, but reported performance varies across the literature, and differences in experimental setup hamper direct comparison of machine learning approaches. In this paper, five of the most widely used and best-performing machine learning models in the field were trained to distinguish participants with autism from typically developing (TD) participants, using functional connectivity matrices, structural volumetric measures, and phenotypic information from the Autism Brain Imaging Data Exchange (ABIDE) dataset, and their performance was compared under the same evaluation standard. The models implemented were graph convolutional networks (GCN), edge-variational graph convolutional networks (EV-GCN), fully connected networks (FCN), an autoencoder followed by a fully connected network (AE-FCN), and a support vector machine (SVM). All models performed similarly, achieving a classification accuracy of around 70%. This suggests that different inclusion criteria, data modalities, and evaluation pipelines, rather than different machine learning models, may explain the variation in accuracy in the published literature. The highest accuracy in our framework was obtained with ensemble models (p < 0.001), reaching 72.2% accuracy and AUC = 0.77 with GCN classifiers. However, an SVM classifier achieved 70.1% accuracy and AUC = 0.77, only marginally below GCN, and no significant differences were found between algorithms under the same testing conditions (p > 0.05). We also investigated the stability of the features identified by the different machine learning models using the SmoothGrad interpretation method. The FCN model demonstrated the highest stability in selecting relevant features contributing to model decision making. The code is available at https://github.com/YilanDong19/Machine-learning-with-ABIDE.
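To make the interpretation step more concrete, the following is a minimal sketch of the general SmoothGrad idea: the gradient of the model output with respect to the input is averaged over several noise-perturbed copies of that input. It is not the authors' implementation (see their repository for that). The logistic-regression stand-in model, the function names, the noise level, and the use of a Jaccard overlap of top-k features as a stability score are all assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    """Toy stand-in classifier (assumption): logistic regression on a feature vector."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def input_gradient(weights, bias, x):
    """Analytic gradient of the predicted probability with respect to the input x."""
    p = predict(weights, bias, x)
    return [p * (1.0 - p) * w for w in weights]


def smoothgrad(weights, bias, x, noise_std=0.1, n_samples=50, seed=0):
    """Average the input gradient over Gaussian-perturbed copies of x."""
    rng = random.Random(seed)
    total = [0.0] * len(x)
    for _ in range(n_samples):
        noisy = [xi + rng.gauss(0.0, noise_std) for xi in x]
        grad = input_gradient(weights, bias, noisy)
        total = [t + g for t, g in zip(total, grad)]
    return [t / n_samples for t in total]


def top_k_features(saliency, k):
    """Indices of the k features with the largest absolute saliency."""
    order = sorted(range(len(saliency)), key=lambda i: abs(saliency[i]), reverse=True)
    return set(order[:k])


def jaccard(a, b):
    """Overlap between two feature sets; a hypothetical stability score."""
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    # Hypothetical weights of two trained models and one participant's features.
    x = [0.5, 0.2, -0.3, 0.1]
    sal_a = smoothgrad([2.0, -1.5, 0.1, 0.0], 0.0, x, seed=1)
    sal_b = smoothgrad([1.8, -0.2, 1.1, 0.0], 0.0, x, seed=2)
    print(jaccard(top_k_features(sal_a, 2), top_k_features(sal_b, 2)))
```

In a real pipeline the gradient would come from automatic differentiation of the trained network rather than a closed-form expression, and the features would be functional connectivity or volumetric measures instead of the four toy values used here.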
About the journal:
Human Brain Mapping publishes peer-reviewed basic, clinical, technical, and theoretical research in the interdisciplinary and rapidly expanding field of human brain mapping. The journal features research derived from non-invasive brain imaging modalities used to explore the spatial and temporal organization of the neural systems supporting human behavior. Imaging modalities of interest include positron emission tomography, event-related potentials, electro- and magnetoencephalography, magnetic resonance imaging, and single-photon emission tomography. Brain mapping research in both normal and clinical populations is encouraged.
Article formats include Research Articles, Review Articles, Clinical Case Studies, and Technique, as well as Technological Developments, Theoretical Articles, and Synthetic Reviews. Technical advances, such as novel brain imaging methods, analyses for detecting or localizing neural activity, synergistic uses of multiple imaging modalities, and strategies for the design of behavioral paradigms and neural-systems modeling are of particular interest. The journal endorses the propagation of methodological standards and encourages database development in the field of human brain mapping.