Multi-Source Learning for Joint Analysis of Incomplete Multi-Modality Neuroimaging Data.

Lei Yuan, Yalin Wang, Paul M Thompson, Vaibhav A Narayan, Jieping Ye
Published in: KDD: Proceedings. International Conference on Knowledge Discovery & Data Mining, 2012
DOI: 10.1145/2339530.2339710 (https://doi.org/10.1145/2339530.2339710)
Citations: 46

Abstract

Incomplete data present serious problems when integrating large-scale brain imaging data sets from different imaging modalities. In the Alzheimer's Disease Neuroimaging Initiative (ADNI), for example, over half of the subjects lack cerebrospinal fluid (CSF) measurements; an independent half of the subjects do not have fluorodeoxyglucose positron emission tomography (FDG-PET) scans; and many lack proteomics measurements. Traditionally, subjects with missing measures are discarded, resulting in a severe loss of available information. We address this problem by proposing two novel learning methods in which all samples with at least one available data source can be used. In the first method, we divide the samples according to the availability of data sources and learn shared sets of features with state-of-the-art sparse learning methods. The second method learns a base classifier for each data source independently and represents each source by a single column of prediction scores; we then estimate the missing prediction scores and combine them with the existing prediction scores to build a multi-source fusion model. To illustrate the proposed approaches, we classify subjects from the ADNI study into groups with Alzheimer's disease (AD), mild cognitive impairment (MCI), and normal controls, based on the multi-modality data. At baseline, ADNI's 780 participants (172 AD, 397 MCI, 211 normal) have at least one of four data types: magnetic resonance imaging (MRI), FDG-PET, CSF, and proteomics. These data are used to test our algorithms. Comprehensive experiments show that the proposed methods yield stable and promising results.
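
As a rough illustration of the two ideas in the abstract, the following Python sketch groups samples by which data sources they have (the split used by the first method) and then builds a score-based fusion model (the outline of the second method). It is not the authors' implementation. The function names, the least-squares base classifier, and the column-mean fill for missing prediction scores are all assumptions made for this example; the abstract does not say which base classifier or which score-estimation technique is used.

```python
import numpy as np

def availability_groups(mask):
    """Group sample indices by which sources each sample has.

    mask: boolean array (n_samples, n_sources); True means the source is present.
    Returns a dict mapping each availability pattern (tuple of bools) to indices.
    """
    groups = {}
    for i, row in enumerate(mask):
        groups.setdefault(tuple(bool(v) for v in row), []).append(i)
    return groups

def fit_linear(X, y):
    """Least-squares linear scorer with a bias term (stand-in base classifier)."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    w, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return w

def score_linear(w, X):
    """Apply a scorer returned by fit_linear."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return Xb @ w

def fit_score_fusion(sources, mask, y):
    """One prediction-score column per source, then a fusion model on the scores.

    sources: list of arrays, one per source, each (n_samples, d_s); rows of
             samples that lack the source are never read.
    mask:    boolean (n_samples, n_sources) availability matrix.
    y:       labels in {-1, +1}.
    """
    n, k = mask.shape
    scores = np.full((n, k), np.nan)
    base = []
    for s, X in enumerate(sources):
        have = mask[:, s]
        w = fit_linear(X[have], y[have])      # train only on samples with source s
        base.append(w)
        scores[have, s] = score_linear(w, X[have])
    # Assumption: estimate each missing score by the mean of the observed
    # scores in the same column. The paper may use a different estimator.
    col_mean = np.nanmean(scores, axis=0)
    filled = np.where(np.isnan(scores), col_mean, scores)
    fusion = fit_linear(filled, y)            # fusion model over the score columns
    return base, col_mean, fusion, filled
```

With the four ADNI data types, `mask` would have four columns (MRI, FDG-PET, CSF, proteomics), and `availability_groups(mask)` would return one index list per combination of sources that actually occurs among the subjects.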
