Deep IDA: A Deep Learning Approach for Integrative Discriminant Analysis of Multi-omics Data with Feature Ranking- An Application to COVID-19

IF 4.6 | Q2 | Materials Science, Biomaterials

ACS Applied Bio Materials | Pub Date: 2024-04-24 | DOI: 10.1093/bioadv/vbae060

Jiuzhou Wang, S. Safo


Citations: 0

Abstract

Many diseases are complex, heterogeneous conditions that affect multiple organs in the body and depend on the interplay between several factors, including molecular and environmental factors, so a holistic approach is needed to better understand disease pathobiology. Most existing methods for integrating data from multiple sources and classifying individuals into one of multiple classes or disease groups have focused mainly on linear relationships, despite the complexity of these relationships. On the other hand, methods for nonlinear association and classification studies either are limited in their ability to identify variables that aid our understanding of the complexity of the disease, or can be applied to only two data types. We propose Deep IDA (Integrative Discriminant Analysis), a deep learning method that learns complex nonlinear transformations of two or more views such that the resulting projections have maximum association and maximum separation. Further, we propose a feature ranking approach based on ensemble learning for interpretable results. We test Deep IDA on both simulated data and two large real-world datasets, including RNA sequencing, metabolomics, and proteomics data pertaining to COVID-19 severity. We identified signatures that better discriminated COVID-19 patient groups and that are related to neurological conditions, cancer, and metabolic diseases, corroborating current research findings and heightening the need to study the post sequelae effects of COVID-19 to devise effective treatments and to improve patient care. Our algorithms are implemented in PyTorch and available at: https://github.com/JiuzhouW/DeepIDA. Supplementary materials are available at Bioinformatics Advances online.
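To make the idea of ensemble-based feature ranking concrete, here is a minimal sketch in plain Python. It is not the authors' algorithm: it assumes a single view, uses a nearest-centroid classifier as a stand-in for the deep network, and scores each feature by how much out-of-bag accuracy drops when that feature is permuted, summed over bootstrap resamples. The names `fit_centroids`, `accuracy`, and `rank_features` and the toy data are hypothetical; the actual implementation is in the repository linked in the abstract.

```python
import random
from statistics import mean


def fit_centroids(X, y):
    """Stand-in classifier: one centroid (per-feature mean) per class label."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids


def accuracy(centroids, X, y):
    """Fraction of rows whose nearest centroid carries the true label."""
    def predict(x):
        return min(
            centroids,
            key=lambda lab: sum((a - b) ** 2 for a, b in zip(x, centroids[lab])),
        )
    return mean([1.0 if predict(x) == lab else 0.0 for x, lab in zip(X, y)])


def rank_features(X, y, n_bootstraps=30, seed=0):
    """Rank features by summed accuracy drop under permutation (assumed scheme)."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    scores = [0.0] * p
    for _ in range(n_bootstraps):
        idx = [rng.randrange(n) for _ in range(n)]      # bootstrap sample
        in_bag = set(idx)
        oob = [i for i in range(n) if i not in in_bag]  # held-out rows
        Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
        if not oob or len(set(yb)) < 2:
            continue  # skip degenerate resamples
        model = fit_centroids(Xb, yb)
        Xo, yo = [X[i] for i in oob], [y[i] for i in oob]
        baseline = accuracy(model, Xo, yo)
        for j in range(p):
            col = [x[j] for x in Xo]
            rng.shuffle(col)  # break the link between feature j and the label
            Xperm = [x[:j] + [v] + x[j + 1:] for x, v in zip(Xo, col)]
            scores[j] += baseline - accuracy(model, Xperm, yo)
    ranking = sorted(range(p), key=lambda j: scores[j], reverse=True)
    return ranking, scores


# Toy data (hypothetical): feature 0 separates the two classes, 1 and 2 are noise.
data_rng = random.Random(1)
X = [
    [data_rng.gauss(5.0 * label, 1.0), data_rng.gauss(0.0, 1.0), data_rng.gauss(0.0, 1.0)]
    for label in (0, 1)
    for _ in range(40)
]
y = [label for label in (0, 1) for _ in range(40)]

ranking, scores = rank_features(X, y)
print(ranking)
```

In this toy setup the informative feature should rank first, because permuting it pushes out-of-bag accuracy toward chance while permuting a noise feature barely changes it. Aggregating over many resamples is what makes the ranking an ensemble estimate rather than the outcome of a single train/test split.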


Source journal

ACS Applied Bio Materials | Chemistry: Chemistry (all)

CiteScore: 9.40

Self-citation rate: 2.10%

Articles published: 464

Journal description: ACS Applied Bio Materials is an interdisciplinary journal publishing original research covering all aspects of biomaterials and biointerfaces including and beyond the traditional biosensing, biomedical and therapeutic applications. The journal is devoted to reports of new and original experimental and theoretical research of an applied nature that integrates knowledge in the areas of materials, engineering, physics, bioscience, and chemistry into important bio applications. The journal is specifically interested in work that addresses the relationship between structure and function and assesses the stability and degradation of materials under relevant environmental and biological conditions.