BuDDI: Bulk Deconvolution with Domain Invariance to predict cell-type-specific perturbations from bulk.

IF 3.8, Zone 2 (Biology), JCR Q1 BIOCHEMICAL RESEARCH METHODS

PLoS Computational Biology Pub Date : 2025-01-17 eCollection Date: 2025-01-01 DOI:10.1371/journal.pcbi.1012742

Natalie R Davidson, Fan Zhang, Casey S Greene

PLoS Computational Biology 21(1): e1012742. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11790236/pdf/

Citations: 0

Abstract

While single-cell experiments provide deep cellular resolution within a single sample, some are inherently more challenging than bulk experiments because of dissociation difficulties, cost, or limited tissue availability. This creates a situation where we have deep cellular profiles of one sample or condition, and bulk profiles across multiple samples and conditions. To bridge this gap, we propose BuDDI (BUlk Deconvolution with Domain Invariance). BuDDI uses domain adaptation techniques to integrate available corpora of case-control bulk and reference scRNA-seq observations and infer cell-type-specific perturbation effects. BuDDI achieves this by learning independent latent spaces within a single variational autoencoder (VAE) encompassing at least four sources of variability: 1) cell-type proportion, 2) perturbation effect, 3) structured experimental variability, and 4) remaining variability. Since each latent space is encouraged to be independent, we simulate cell-type-specific perturbation responses by independently composing each latent space. We evaluated BuDDI's performance on simulated and real data with experimental designs of increasing complexity. We first validated that BuDDI could learn domain-invariant latent spaces on data with matched samples across each source of variability. Then we validated that BuDDI could accurately predict cell-type-specific perturbation response when no single-cell perturbed profiles were used during training; instead, only bulk samples had both perturbed and non-perturbed observations. Finally, we validated BuDDI on predicting sex-specific differences, an experimental design where matched samples are not possible. In each experiment, BuDDI outperformed all other compared methods and baselines. As more reference atlases are completed, BuDDI provides a path to combine these resources with bulk-profiled treatment or disease signatures to study perturbations, sex differences, or other factors at single-cell resolution.
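The idea of composing independent latent spaces can be illustrated with a toy sketch. This is not BuDDI's implementation: the slot names and sizes, the helper functions `split_latent` and `compose_latent`, and the fixed linear decoder are all hypothetical stand-ins chosen for illustration, and the random vectors stand in for the outputs of a trained encoder. The sketch only shows the mechanics of keeping every latent slot from an unperturbed sample while swapping in the perturbation slot from a perturbed one.

```python
import numpy as np

# Hypothetical slot names and sizes; the abstract names the four sources of
# variability but does not give their dimensions.
SLOTS = {"proportion": 3, "perturbation": 2, "experiment": 2, "remaining": 4}


def split_latent(z):
    """Split a concatenated latent vector into one array per named slot."""
    parts, start = {}, 0
    for name, size in SLOTS.items():
        parts[name] = z[start:start + size]
        start += size
    return parts


def compose_latent(base, **overrides):
    """Return a latent vector that copies `base` but replaces the given slots."""
    parts = split_latent(base)
    for name, value in overrides.items():
        if name not in SLOTS:
            raise KeyError(f"unknown slot: {name}")
        value = np.asarray(value, dtype=float)
        if value.shape != (SLOTS[name],):
            raise ValueError(f"slot {name!r} expects shape {(SLOTS[name],)}")
        parts[name] = value
    return np.concatenate([parts[name] for name in SLOTS])


rng = np.random.default_rng(0)
latent_dim = sum(SLOTS.values())
n_genes = 5

# Toy stand-in for a trained decoder: one fixed linear map from latent to genes.
decoder_weights = rng.normal(size=(latent_dim, n_genes))


def decode(z):
    """Map a latent vector to a toy expression profile."""
    return z @ decoder_weights


z_control = rng.normal(size=latent_dim)    # stands in for an encoded unperturbed sample
z_perturbed = rng.normal(size=latent_dim)  # stands in for an encoded perturbed sample

# Keep everything from the control sample except the perturbation slot.
z_simulated = compose_latent(
    z_control, perturbation=split_latent(z_perturbed)["perturbation"]
)

# Difference in decoded expression attributable to the swapped slot alone.
simulated_effect = decode(z_simulated) - decode(z_control)
```

Because only one slot changes between `z_control` and `z_simulated`, any difference in the decoded profiles comes from the perturbation slot. With a real nonlinear decoder the same swap applies, but the effect would no longer be a simple linear function of the slot difference.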

Source journal

PLoS Computational Biology BIOCHEMICAL RESEARCH METHODS-MATHEMATICAL & COMPUTATIONAL BIOLOGY

CiteScore: 7.10
Self-citation rate: 4.70%
Articles published: 820
Review time: 2.5 months

Journal description: PLOS Computational Biology features works of exceptional significance that further our understanding of living systems at all scales, from molecules and cells to patient populations and ecosystems, through the application of computational methods. Readers include life and computational scientists, who can take the important findings presented here to the next level of discovery. Research articles must be declared as belonging to a relevant section. More information about the sections can be found in the submission guidelines. Research articles should model aspects of biological systems, demonstrate both methodological and scientific novelty, and provide profound new biological insights. Generally, reliability and significance of biological discovery through computation should be validated and enriched by experimental studies. Inclusion of experimental validation is not required for publication, but should be referenced where possible. Inclusion of experimental validation of a modest biological discovery through computation does not render a manuscript suitable for PLOS Computational Biology. Research articles specifically designated as Methods papers should describe outstanding methods of exceptional importance that have been shown, or have the promise, to provide new biological insights. The method must already be widely adopted, or have the promise of wide adoption by a broad community of users. Enhancements to existing published methods will only be considered if those enhancements bring exceptional new capabilities.