Challenges in AI-driven multi-omics data analysis for Oncology: Addressing dimensionality, sparsity, transparency and ethical considerations

Q1 Medicine
Maryem Ouhmouk, Shakuntala Baichoo, Mounia Abik
Informatics in Medicine Unlocked, volume 57, Article 101679. Published 2025-01-01. DOI: 10.1016/j.imu.2025.101679
https://www.sciencedirect.com/science/article/pii/S2352914825000681

Abstract

Artificial intelligence, particularly deep learning, is becoming increasingly prominent in multi-omics research, largely because traditional statistical models struggle to handle the complexity and high dimensionality of such data. By combining different types of omics data, AI techniques can reveal hidden connections, detect biomarkers, and improve disease prediction, which can lead to significant advances in precision medicine. In this review, we gathered deep learning-based multi-omics integration methods for oncology published since 2020. We restricted the review to studies using cancer omics data, mainly sourced from The Cancer Genome Atlas (TCGA) database. We identified 32 articles that generally fulfilled the criteria. We examined their techniques and their ability to handle challenges in analyzing multi-omics data, particularly missing data, dimensionality, and processing workflows. We also discuss how well these methods address explainability, interpretability, and ethical considerations when developing solutions that handle private and sensitive medical information.
From the 32 studies, we divide deep learning-based multi-omics integration methods into two types: non-generative and generative models. Non-generative approaches, such as feedforward neural networks (FFNs), graph convolutional networks (GCNs), and autoencoders, are designed to extract features and perform classification directly. Generative methods, such as variational autoencoders (VAEs), generative adversarial networks (GANs), and generative pretrained transformers (GPTs), instead focus on creating adaptable representations that can be shared across multiple modalities. These methods have advanced the handling of missing data and dimensionality, outperforming traditional approaches. However, most reviewed models remain at the proof-of-concept stage, with limited clinical validation or real-world deployment.
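To make the dimensionality and missing-data challenges concrete, the following is a minimal sketch of the autoencoder idea, not the method of any reviewed study. It assumes a linear autoencoder as a stand-in for the deeper models discussed, synthetic data in place of TCGA, two hypothetical omics blocks (`expression`, `methylation`) concatenated feature-wise, and a reconstruction loss computed only over observed entries. All names and hyperparameters here are illustrative.

```python
import numpy as np

def train_masked_autoencoder(X, mask, latent_dim=2, lr=0.1, epochs=1000, seed=0):
    """Train a linear autoencoder whose loss ignores missing entries.

    X    : (samples, features) array; entries where mask is False may be NaN.
    mask : boolean array of the same shape, True where a value was observed.
    """
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    W_enc = rng.normal(scale=0.1, size=(n_features, latent_dim))
    W_dec = rng.normal(scale=0.1, size=(latent_dim, n_features))
    X_in = np.where(mask, X, 0.0)   # zero-fill missing inputs (mean fill after scaling)
    n_obs = mask.sum()
    losses = []
    for _ in range(epochs):
        Z = X_in @ W_enc                  # low-dimensional embedding
        X_hat = Z @ W_dec                 # reconstruction of all features
        err = (X_hat - X_in) * mask       # error counted on observed entries only
        losses.append(float((err ** 2).sum() / n_obs))
        grad_out = 2.0 * err / n_obs      # gradient of the masked mean squared error
        grad_dec = Z.T @ grad_out
        grad_enc = X_in.T @ (grad_out @ W_dec.T)
        W_enc -= lr * grad_enc
        W_dec -= lr * grad_dec
    return W_enc, W_dec, losses

# Synthetic stand-in for two omics layers driven by two shared latent factors.
rng = np.random.default_rng(42)
n_samples = 60
factors = rng.normal(size=(n_samples, 2))
expression = factors @ rng.normal(size=(2, 20)) + 0.1 * rng.normal(size=(n_samples, 20))
methylation = factors @ rng.normal(size=(2, 15)) + 0.1 * rng.normal(size=(n_samples, 15))
X_raw = np.hstack([expression, methylation])   # early (feature-level) integration

# Assumption: roughly 20% of the methylation values are missing at random.
mask = np.ones(X_raw.shape, dtype=bool)
mask[:, 20:] = rng.random((n_samples, 15)) > 0.2

# Standardize each feature using its observed values only.
X_obs = np.where(mask, X_raw, np.nan)
X_std = (X_obs - np.nanmean(X_obs, axis=0)) / np.nanstd(X_obs, axis=0)

W_enc, W_dec, losses = train_masked_autoencoder(X_std, mask)
embedding = np.where(mask, X_std, 0.0) @ W_enc        # 35 features reduced to 2
imputed = np.where(mask, X_std, embedding @ W_dec)    # fill gaps from the reconstruction
```

The same masked-loss pattern carries over to nonlinear encoders; a generative variant such as a VAE would additionally place a prior on the embedding, which this sketch omits.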
Source journal
Informatics in Medicine Unlocked (Medicine-Health Informatics)
CiteScore: 9.50
Self-citation rate: 0.00%
Articles published: 282
Review time: 39 days
About the journal: Informatics in Medicine Unlocked (IMU) is an international gold open access journal covering a broad spectrum of topics within medical informatics, including (but not limited to) papers focusing on imaging, pathology, teledermatology, public health, ophthalmological, nursing and translational medicine informatics. The full papers published in the journal are accessible to all who visit the website.