Alzheimer's disease detection using data fusion with a deep supervised encoder

Minh Trinh, Ryan Shahbaba, Craig Stark, Yueqi Ren
{"title":"Alzheimer's disease detection using data fusion with a deep supervised encoder","authors":"Minh Trinh, Ryan Shahbaba, Craig Stark, Yueqi Ren","doi":"10.3389/frdem.2024.1332928","DOIUrl":null,"url":null,"abstract":"Alzheimer's disease (AD) is affecting a growing number of individuals. As a result, there is a pressing need for accurate and early diagnosis methods. This study aims to achieve this goal by developing an optimal data analysis strategy to enhance computational diagnosis. Although various modalities of AD diagnostic data are collected, past research on computational methods of AD diagnosis has mainly focused on using single-modal inputs. We hypothesize that integrating, or “fusing,” various data modalities as inputs to prediction models could enhance diagnostic accuracy by offering a more comprehensive view of an individual's health profile. However, a potential challenge arises as this fusion of multiple modalities may result in significantly higher dimensional data. We hypothesize that employing suitable dimensionality reduction methods across heterogeneous modalities would not only help diagnosis models extract latent information but also enhance accuracy. Therefore, it is imperative to identify optimal strategies for both data fusion and dimensionality reduction. In this paper, we have conducted a comprehensive comparison of over 80 statistical machine learning methods, considering various classifiers, dimensionality reduction techniques, and data fusion strategies to assess our hypotheses. Specifically, we have explored three primary strategies: (1) Simple data fusion, which involves straightforward concatenation (fusion) of datasets before inputting them into a classifier; (2) Early data fusion, in which datasets are concatenated first, and then a dimensionality reduction technique is applied before feeding the resulting data into a classifier; and (3) Intermediate data fusion, in which dimensionality reduction methods are applied individually to each dataset before concatenating them to construct a classifier. For dimensionality reduction, we have explored several commonly-used techniques such as principal component analysis (PCA), autoencoder (AE), and LASSO. Additionally, we have implemented a new dimensionality-reduction method called the supervised encoder (SE), which involves slight modifications to standard deep neural networks. Our results show that SE substantially improves prediction accuracy compared to PCA, AE, and LASSO, especially in combination with intermediate fusion for multiclass diagnosis prediction.","PeriodicalId":408305,"journal":{"name":"Frontiers in Dementia","volume":"74 7","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-02-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Frontiers in Dementia","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3389/frdem.2024.1332928","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Alzheimer's disease (AD) is affecting a growing number of individuals. As a result, there is a pressing need for accurate and early diagnosis methods. This study aims to achieve this goal by developing an optimal data analysis strategy to enhance computational diagnosis. Although various modalities of AD diagnostic data are collected, past research on computational methods of AD diagnosis has mainly focused on using single-modal inputs. We hypothesize that integrating, or “fusing,” various data modalities as inputs to prediction models could enhance diagnostic accuracy by offering a more comprehensive view of an individual's health profile. However, a potential challenge arises as this fusion of multiple modalities may result in significantly higher dimensional data. We hypothesize that employing suitable dimensionality reduction methods across heterogeneous modalities would not only help diagnosis models extract latent information but also enhance accuracy. Therefore, it is imperative to identify optimal strategies for both data fusion and dimensionality reduction. In this paper, we have conducted a comprehensive comparison of over 80 statistical machine learning methods, considering various classifiers, dimensionality reduction techniques, and data fusion strategies to assess our hypotheses. Specifically, we have explored three primary strategies: (1) Simple data fusion, which involves straightforward concatenation (fusion) of datasets before inputting them into a classifier; (2) Early data fusion, in which datasets are concatenated first, and then a dimensionality reduction technique is applied before feeding the resulting data into a classifier; and (3) Intermediate data fusion, in which dimensionality reduction methods are applied individually to each dataset before concatenating them to construct a classifier. For dimensionality reduction, we have explored several commonly used techniques such as principal component analysis (PCA), autoencoder (AE), and LASSO. Additionally, we have implemented a new dimensionality-reduction method called the supervised encoder (SE), which involves slight modifications to standard deep neural networks. Our results show that SE substantially improves prediction accuracy compared to PCA, AE, and LASSO, especially in combination with intermediate fusion for multiclass diagnosis prediction.
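To make the three fusion strategies concrete, here is a minimal sketch in Python using scikit-learn. PCA and logistic regression stand in for the many reducer/classifier combinations the paper compares; the feature matrices X_mri and X_clinical, the labels y, and all dimensions are hypothetical placeholders, not the study's data.

```python
# Minimal sketch of the three fusion strategies from the abstract.
# X_mri, X_clinical, and y are hypothetical stand-ins for two data
# modalities (samples x features) and multiclass diagnosis labels.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_mri = rng.normal(size=(200, 500))      # e.g., imaging-derived features
X_clinical = rng.normal(size=(200, 40))  # e.g., clinical/cognitive scores
y = rng.integers(0, 3, size=200)         # multiclass diagnosis labels

# (1) Simple fusion: concatenate modalities, classify directly.
X_simple = np.hstack([X_mri, X_clinical])
clf_simple = LogisticRegression(max_iter=1000).fit(X_simple, y)

# (2) Early fusion: concatenate first, then reduce dimensionality once.
X_early = PCA(n_components=20).fit_transform(X_simple)
clf_early = LogisticRegression(max_iter=1000).fit(X_early, y)

# (3) Intermediate fusion: reduce each modality separately, then
# concatenate the low-dimensional representations for the classifier.
Z_mri = PCA(n_components=15).fit_transform(X_mri)
Z_clinical = PCA(n_components=5).fit_transform(X_clinical)
X_inter = np.hstack([Z_mri, Z_clinical])
clf_inter = LogisticRegression(max_iter=1000).fit(X_inter, y)
```

The abstract describes the supervised encoder (SE) only as a slight modification of a standard deep neural network. One plausible reading, sketched below in PyTorch, is a feed-forward classifier with a low-dimensional bottleneck whose activations serve as the reduced representation; because the labels supervise the bottleneck, it can capture diagnosis-relevant structure that an unsupervised autoencoder may miss. The architecture, layer sizes, and training details here are illustrative assumptions, not the paper's specification.

```python
# Hedged sketch of a "supervised encoder": a classifier whose
# low-dimensional hidden layer is reused as the reduced representation.
import torch
import torch.nn as nn

class SupervisedEncoder(nn.Module):
    def __init__(self, in_dim, latent_dim, n_classes):
        super().__init__()
        # Encoder maps inputs to a low-dimensional latent code ...
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, 128), nn.ReLU(),
            nn.Linear(128, latent_dim),
        )
        # ... and a classification head supervises that code with labels,
        # unlike an autoencoder, which is trained only to reconstruct.
        self.head = nn.Linear(latent_dim, n_classes)

    def forward(self, x):
        return self.head(self.encoder(x))

model = SupervisedEncoder(in_dim=540, latent_dim=20, n_classes=3)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(200, 540)        # hypothetical fused feature matrix
y = torch.randint(0, 3, (200,))  # hypothetical diagnosis labels
for _ in range(100):             # toy training loop
    opt.zero_grad()
    loss_fn(model(x), y).backward()
    opt.step()

# After training, the encoder alone yields the 20-d supervised embedding,
# usable per modality (intermediate fusion) or on concatenated data
# (early fusion), in place of PCA in the sketch above.
with torch.no_grad():
    z = model.encoder(x)
```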