Harmonization for Parkinson's Disease Multi-Dataset T1 MRI Morphometry Classification.

Impact factor: 1.6 (JCR Q3, Clinical Neurology)
NeuroSci. Publication date: 2024-11-29. DOI: 10.3390/neurosci5040042
Mohammed Saqib, Silvina G Horovitz
NeuroSci, volume 5, issue 4, pages 600-613. Journal Article. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11678312/pdf/
Citation count: 0

Abstract

Classifying individuals as patients or healthy volunteers offers a clinically useful alternative to traditional group statistics because it yields individualized predictions. Classifiers for neurodegenerative disease can be trained on structural MRI morphometry, but they require large multi-scanner datasets, which introduce confounding batch effects. We test ComBat, a common harmonization model, in an example application that classifies subjects with Parkinson's disease versus healthy volunteers, and we identify common pitfalls, including data leakage. We used a multi-dataset cohort of 372 subjects (216 with Parkinson's disease, 156 healthy volunteers) from 11 identified scanners. We extracted both FreeSurfer morphometry and Jacobian-determinant morphometry to compare single-scanner and multi-scanner classification pipelines. We confirm the presence of batch effects by running single-scanner classifiers, whose AUCs diverged widely across scanner-specific datasets (mean: 0.651 ± 0.144). Multi-scanner classifiers that considered neurobiological batch effects between sites could easily achieve a test AUC of 0.902, whereas pipelines that prevented data leakage achieved a test AUC of only 0.550. We conclude that batch effects remain a major issue for classification problems, such that even impressive single-scanner classifiers are unlikely to generalize to multiple scanners, and that correcting for batch effects in a classification problem must avoid circularity and the overly optimistic results it produces.
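The data-leakage pitfall named in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline and it is not ComBat itself. It is a simplified per-scanner location-and-scale adjustment, and it omits the empirical Bayes shrinkage and covariate preservation that ComBat adds. The function names and the toy values are hypothetical. The point it shows is only the ordering: scanner statistics are estimated on the training split alone and then applied unchanged to the test split.

```python
from statistics import mean, pstdev


def fit_scanner_stats(values, scanners):
    """Estimate per-scanner mean and standard deviation from TRAINING data only."""
    by_scanner = {}
    for value, scanner in zip(values, scanners):
        by_scanner.setdefault(scanner, []).append(value)
    stats = {}
    for scanner, vals in by_scanner.items():
        sd = pstdev(vals)
        # Guard against a zero spread so the later division is always defined.
        stats[scanner] = (mean(vals), sd if sd > 0 else 1.0)
    return stats


def harmonize(values, scanners, stats):
    """Remove scanner-specific location and scale using previously fitted stats."""
    harmonized = []
    for value, scanner in zip(values, scanners):
        mu, sd = stats[scanner]  # raises KeyError for a scanner unseen in training
        harmonized.append((value - mu) / sd)
    return harmonized


# Toy single-feature data (hypothetical values, two scanners "A" and "B").
train_values = [2.0, 4.0, 10.0, 14.0]
train_scanners = ["A", "A", "B", "B"]
test_values = [3.0, 12.0]
test_scanners = ["A", "B"]

# Fit on the training split only; the test split never influences the estimates.
stats = fit_scanner_stats(train_values, train_scanners)
train_h = harmonize(train_values, train_scanners, stats)
test_h = harmonize(test_values, test_scanners, stats)
```

Fitting the scanner statistics on the pooled training and test data instead would let test-set information shape the features the classifier sees, which is one form of the circularity the abstract warns about. In this sketch a test subject from a scanner absent from the training split cannot be harmonized at all, which mirrors the practical difficulty of generalizing to unseen scanners.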
