Missing cell types in single-cell references impact deconvolution of bulk data but are detectable

Impact factor 10.1; Zone 1 (Biology); JCR Q1 (Biotechnology & Applied Microbiology)
Adriana Ivich, Natalie R. Davidson, Laurie Grieshober, Weishan Li, Stephanie C. Hicks, Jennifer A. Doherty, Casey S. Greene
Journal: Genome Biology. Published: 2025-04-07. DOI: 10.1186/s13059-025-03506-9 (https://doi.org/10.1186/s13059-025-03506-9)

Abstract

Advancements in RNA sequencing have expanded our ability to study gene expression profiles of biological samples in bulk tissue and single cells. Deconvolution of bulk data with single-cell references provides the ability to study relative cell-type proportions, but most methods assume a reference is present for every cell type in the bulk data. This is not true in all circumstances: cell types can be missing from single-cell profiles for many reasons. In this study, we examine the impact of missing cell types on deconvolution methods. Because single-nucleus RNA sequencing can capture cell types that are absent from a single-cell counterpart, we use paired single-cell and single-nucleus data to simulate realistic scenarios in which cell types are missing and to study their effects on deconvolution. We evaluate three different methods and find that performance is influenced by both the number and the similarity of missing cell types. Additionally, missing cell-type profiles can be recovered from residuals using a simple non-negative matrix factorization strategy. We also analyze real bulk data from cancerous and non-cancerous samples and observe results consistent with simulation, namely that expression patterns from cell types likely to be missing appear in the residuals. We expect our results to provide a starting point for those developing new deconvolution methods and to help improve these methods so that they better account for missing cell types. Our results suggest that deconvolution methods should consider the possibility of missing cell types.
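The residual idea described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' pipeline and does not use any of the three methods they evaluate: it assumes a toy non-negative least-squares deconvolution (multiplicative updates) as a stand-in, builds hypothetical signatures with 50 marker genes per cell type, drops one cell type from the reference, and then applies a rank-1 non-negative matrix factorization to the clipped residuals. All names (`signatures`, `nnls_multiplicative`, `rank1_nmf`) and all sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_genes, n_types, n_samples = 200, 4, 30

# Hypothetical signatures: low background plus a block of 50 marker genes per type.
signatures = rng.uniform(0.0, 1.0, size=(n_genes, n_types))
for k in range(n_types):
    signatures[k * 50:(k + 1) * 50, k] += 10.0

# Simulated proportions (each sample sums to 1) and a noise-free pseudobulk matrix.
proportions = rng.dirichlet(np.ones(n_types), size=n_samples).T
bulk = signatures @ proportions

# Incomplete reference: the last cell type is withheld, mimicking a missing type.
missing = n_types - 1
reference = signatures[:, :missing]


def nnls_multiplicative(ref, mix, n_iter=500, eps=1e-12):
    """Toy non-negative least squares: minimise ||mix - ref @ est|| with est >= 0."""
    est = np.full((ref.shape[1], mix.shape[1]), 1.0 / ref.shape[1])
    for _ in range(n_iter):
        est *= (ref.T @ mix) / (ref.T @ ref @ est + eps)
    return est


def rank1_nmf(mat, n_iter=500, eps=1e-12):
    """Rank-1 NMF by multiplicative updates: mat is approximated by w @ h."""
    w = rng.uniform(0.1, 1.0, size=(mat.shape[0], 1))
    h = rng.uniform(0.1, 1.0, size=(1, mat.shape[1]))
    for _ in range(n_iter):
        w *= (mat @ h.T) / (w @ (h @ h.T) + eps)
        h *= (w.T @ mat) / ((w.T @ w) @ h + eps)
    return w, h


# Deconvolve with the incomplete reference, then factorise what is left over.
estimated = nnls_multiplicative(reference, bulk)
residual = np.clip(bulk - reference @ estimated, 0.0, None)  # NMF needs non-negative input
W, H = rank1_nmf(residual)

# Compare the recovered gene-level factor with the withheld cell-type profile.
corr = np.corrcoef(W[:, 0], signatures[:, missing])[0, 1]
print(f"Pearson correlation with the withheld profile: {corr:.2f}")
```

In this toy setting the withheld cell type has marker genes that no reference profile can explain, so its signal stays in the residuals and the rank-1 factor tracks it. Real data would add noise, platform differences between single-cell and bulk profiles, and possibly several missing types, which would call for a higher factorization rank.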
Source journal: Genome Biology (Biochemistry, Genetics and Molecular Biology: Genetics)
CiteScore: 21.00
Self-citation rate: 3.30%
Articles published: 241
Review time: 2 months
Journal description: Genome Biology stands as a premier platform for exceptional research across all domains of biology and biomedicine, explored through a genomic and post-genomic lens. With an impact factor of 12.3 (2022), the journal is the 3rd-ranked research journal in the Genetics and Heredity category and the 2nd-ranked research journal in the Biotechnology and Applied Microbiology category by Thomson Reuters. Genome Biology is also the highest-ranked open-access journal in this category. Its team of in-house Editors collaborates closely with an Editorial Board of international experts to keep the journal at the forefront of scientific advances and community standards. Regular engagement with researchers at conferences and institute visits helps the journal stay abreast of the latest developments in the field.