Salvatore Milite, Giulio Caravagna, Andrea Sottoriva
{"title":"MIDAA: deep archetypal analysis for interpretable multi-omic data integration based on biological principles","authors":"Salvatore Milite, Giulio Caravagna, Andrea Sottoriva","doi":"10.1186/s13059-025-03530-9","DOIUrl":null,"url":null,"abstract":"High-throughput multi-omic molecular profiling allows the probing of biological systems at unprecedented resolution. However, integrating and interpreting high-dimensional, sparse, and noisy multimodal datasets remains challenging. Deriving new biological insights with current methods is difficult because they are not rooted in biological principles but prioritise tasks like dimensionality reduction. Here, we introduce a framework that combines archetypal analysis, an approach grounded in biological principles, with deep learning. Using archetypes based on evolutionary trade-offs and Pareto optimality, MIDAA finds extreme data points that define the geometry of the latent space, preserving the complexity of biological interactions while retaining an interpretable output. We demonstrate that these extreme points represent cellular programmes reflecting the underlying biology. Moreover, we show that, compared to alternative methods, MIDAA can identify parsimonious, interpretable, and biologically relevant patterns from real and simulated multi-omics.","PeriodicalId":12611,"journal":{"name":"Genome Biology","volume":"59 1","pages":""},"PeriodicalIF":10.1000,"publicationDate":"2025-04-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Genome Biology","FirstCategoryId":"99","ListUrlMain":"https://doi.org/10.1186/s13059-025-03530-9","RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"BIOTECHNOLOGY & APPLIED MICROBIOLOGY","Score":null,"Total":0}
引用次数: 0
Abstract
High-throughput multi-omic molecular profiling allows the probing of biological systems at unprecedented resolution. However, integrating and interpreting high-dimensional, sparse, and noisy multimodal datasets remains challenging. Deriving new biological insights with current methods is difficult because they are not rooted in biological principles but prioritise tasks like dimensionality reduction. Here, we introduce a framework that combines archetypal analysis, an approach grounded in biological principles, with deep learning. Using archetypes based on evolutionary trade-offs and Pareto optimality, MIDAA finds extreme data points that define the geometry of the latent space, preserving the complexity of biological interactions while retaining an interpretable output. We demonstrate that these extreme points represent cellular programmes reflecting the underlying biology. Moreover, we show that, compared to alternative methods, MIDAA can identify parsimonious, interpretable, and biologically relevant patterns from real and simulated multi-omics.
Genome BiologyBiochemistry, Genetics and Molecular Biology-Genetics
CiteScore
21.00
自引率
3.30%
发文量
241
审稿时长
2 months
期刊介绍:
Genome Biology stands as a premier platform for exceptional research across all domains of biology and biomedicine, explored through a genomic and post-genomic lens.
With an impressive impact factor of 12.3 (2022),* the journal secures its position as the 3rd-ranked research journal in the Genetics and Heredity category and the 2nd-ranked research journal in the Biotechnology and Applied Microbiology category by Thomson Reuters. Notably, Genome Biology holds the distinction of being the highest-ranked open-access journal in this category.
Our dedicated team of highly trained in-house Editors collaborates closely with our esteemed Editorial Board of international experts, ensuring the journal remains on the forefront of scientific advances and community standards. Regular engagement with researchers at conferences and institute visits underscores our commitment to staying abreast of the latest developments in the field.