LEOPARD: missing view completion for multi-timepoint omics data via representation disentanglement and temporal knowledge transfer

IF 15.7, Zone 1 (multidisciplinary journals), Q1 MULTIDISCIPLINARY SCIENCES

Nature Communications Pub Date : 2025-04-06 DOI:10.1038/s41467-025-58314-3

Siyu Han, Shixiang Yu, Mengya Shi, Makoto Harada, Jianhong Ge, Jiesheng Lin, Cornelia Prehn, Agnese Petrera, Ying Li, Flora Sam, Giuseppe Matullo, Jerzy Adamski, Karsten Suhre, Christian Gieger, Stefanie M. Hauck, Christian Herder, Michael Roden, Francesco Paolo Casale, Na Cai, Annette Peters, Rui Wang-Sattler


Citations: 0

Abstract

Longitudinal multi-view omics data offer unique insights into the temporal dynamics of individual-level physiology, which provides opportunities to advance personalized healthcare. However, the common occurrence of incomplete views makes extrapolation tasks difficult, and there is a lack of tailored methods for this critical issue. Here, we introduce LEOPARD, an innovative approach specifically designed to complete missing views in multi-timepoint omics data. By disentangling longitudinal omics data into content and temporal representations, LEOPARD transfers the temporal knowledge to the omics-specific content, thereby completing missing views. The effectiveness of LEOPARD is validated on four real-world omics datasets constructed with data from the MGH COVID study and the KORA cohort, spanning periods from 3 days to 14 years. Compared to conventional imputation methods, such as missForest, PMM, GLMM, and cGAN, LEOPARD yields the most robust results across the benchmark datasets. LEOPARD-imputed data also achieve the highest agreement with observed data in our analyses for age-associated metabolites detection, estimated glomerular filtration rate-associated proteins identification, and chronic kidney disease prediction. Our work takes the first step toward a generalized treatment of missing views in longitudinal omics data, enabling comprehensive exploration of temporal dynamics and providing valuable insights into personalized healthcare.
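To make the term "missing view" concrete, the following is a minimal sketch of the problem setup only, not of LEOPARD itself. It assumes a hypothetical layout in which each (view, timepoint) pair is stored as a samples-by-features array, and a view counts as missing when its whole block is absent for a timepoint. The names `find_missing_views`, `carry_forward`, `view_A`, `view_B`, `t1`, and `t2` are illustrative assumptions, and the data are random placeholders. The carry-forward fill is a deliberately naive baseline that reuses the content of the same view from another timepoint and ignores temporal change, which is the gap that temporal knowledge transfer is meant to close.

```python
import numpy as np


def find_missing_views(data):
    """Return the (view, timepoint) keys whose entire block is NaN."""
    return [key for key, block in data.items() if np.isnan(block).all()]


def carry_forward(data, view, source_tp, target_tp):
    """Naive baseline (not LEOPARD): copy a view from an observed
    timepoint into the timepoint where that view is missing."""
    completed = dict(data)  # shallow copy so the input dict is left unchanged
    completed[(view, target_tp)] = data[(view, source_tp)].copy()
    return completed


# Hypothetical toy data: 5 samples, two views with different feature counts.
rng = np.random.default_rng(0)
n_samples = 5
data = {
    ("view_A", "t1"): rng.normal(size=(n_samples, 4)),
    ("view_A", "t2"): rng.normal(size=(n_samples, 4)),
    ("view_B", "t1"): rng.normal(size=(n_samples, 3)),
    # view_B was not measured at t2, so the whole block is missing.
    ("view_B", "t2"): np.full((n_samples, 3), np.nan),
}

missing = find_missing_views(data)
completed = carry_forward(data, "view_B", "t1", "t2")
```

Because the baseline copies `view_B` unchanged from `t1`, any real shift between the two timepoints is lost. A method of the kind the abstract describes would instead combine the view-specific content with a temporal representation learned from the views that were observed at the target timepoint.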


Source journal

Nature Communications Biological Science Disciplines-

CiteScore: 24.90
Self-citation rate: 2.40%
Articles published: 6928
Review time: 3.7 months

About the journal: Nature Communications, an open-access journal, publishes high-quality research spanning all areas of the natural sciences. Papers featured in the journal showcase significant advances relevant to specialists in each respective field. With a 2-year impact factor of 16.6 (2022) and a median time of 8 days from submission to the first editorial decision, Nature Communications is committed to rapid dissemination of research findings. As a multidisciplinary journal, it welcomes contributions from biological, health, physical, chemical, Earth, social, mathematical, applied, and engineering sciences, aiming to highlight important breakthroughs within each domain.