Yicheng Gao, Kejing Dong, Caihua Shan, Dongsheng Li, Qi Liu
{"title":"Causal disentanglement for single-cell representations and controllable counterfactual generation","authors":"Yicheng Gao, Kejing Dong, Caihua Shan, Dongsheng Li, Qi Liu","doi":"10.1038/s41467-025-62008-1","DOIUrl":null,"url":null,"abstract":"<p>Conducting disentanglement learning on single-cell omics data offers a promising alternative to traditional black-box representation learning by separating the semantic concepts embedded in a biological process. We present CausCell, which incorporates the factual information about causal relationships among disentangled concepts within a diffusion model to generate more reliable disentangled cellular representations, with the aim of increasing the explainability, generalizability and controllability of single-cell data, including spatial-temporal omics data, relative to those of the existing black-box representation learning models. Two quantitative evaluation scenarios, i.e., disentanglement and reconstruction, are presented to conduct the first comprehensive single-cell disentanglement learning benchmark, which demonstrates that CausCell outperforms the state-of-the-art methods in both scenarios. Additionally, CausCell can implement controllable generation by intervening with the concepts of single-cell data when given a causal structure. It also has the potential to uncover biological insights by generating counterfactuals from small and noisy single-cell datasets.</p>","PeriodicalId":19066,"journal":{"name":"Nature Communications","volume":"106 1","pages":""},"PeriodicalIF":15.7000,"publicationDate":"2025-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Nature Communications","FirstCategoryId":"103","ListUrlMain":"https://doi.org/10.1038/s41467-025-62008-1","RegionNum":1,"RegionCategory":"综合性期刊","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"MULTIDISCIPLINARY SCIENCES","Score":null,"Total":0}
引用次数: 0
Abstract
Conducting disentanglement learning on single-cell omics data offers a promising alternative to traditional black-box representation learning by separating the semantic concepts embedded in a biological process. We present CausCell, which incorporates the factual information about causal relationships among disentangled concepts within a diffusion model to generate more reliable disentangled cellular representations, with the aim of increasing the explainability, generalizability and controllability of single-cell data, including spatial-temporal omics data, relative to those of the existing black-box representation learning models. Two quantitative evaluation scenarios, i.e., disentanglement and reconstruction, are presented to conduct the first comprehensive single-cell disentanglement learning benchmark, which demonstrates that CausCell outperforms the state-of-the-art methods in both scenarios. Additionally, CausCell can implement controllable generation by intervening with the concepts of single-cell data when given a causal structure. It also has the potential to uncover biological insights by generating counterfactuals from small and noisy single-cell datasets.
期刊介绍:
Nature Communications, an open-access journal, publishes high-quality research spanning all areas of the natural sciences. Papers featured in the journal showcase significant advances relevant to specialists in each respective field. With a 2-year impact factor of 16.6 (2022) and a median time of 8 days from submission to the first editorial decision, Nature Communications is committed to rapid dissemination of research findings. As a multidisciplinary journal, it welcomes contributions from biological, health, physical, chemical, Earth, social, mathematical, applied, and engineering sciences, aiming to highlight important breakthroughs within each domain.