Conformal inference for reliable single cell RNA-seq annotation.

IF 5.4
Marcos López-De-Castro, Alberto García-Galindo, José González-Gomariz, Rubén Armañanzas
Journal: Bioinformatics (Oxford, England)
Publication date: 2025-10-02
DOI: https://doi.org/10.1093/bioinformatics/btaf521
Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12506889/pdf/
Citations: 0

Abstract

Motivation: Despite the inherent complexity of automatic cell type assignment, most supervised learning models overlook rigorous uncertainty quantification of the annotations. Although some existing pipelines incorporate rejection options under predefined circumstances, they usually rely on arbitrary assumptions and do not provide statistical guarantees. In this work, we propose a methodology based on the conformal prediction framework to provide reliable single-cell annotations. Conformal prediction provides statistical guarantees on the outcome predictions without making any assumption about the underlying distribution of the data. Our methodological proposal leverages conformal inference to address two critical challenges in single-cell RNA sequencing annotation: (i) detecting out-of-distribution cell types in the query data; and (ii) performing reliable uncertainty quantification of the cell annotations through well-calibrated prediction sets.

Results: We evaluated the anomaly detector and the uncertainty-aware annotator in 10 batched experiments derived from various tissues. Specifically, we studied three different annotation taxonomies (standard, classwise, and cluster) alongside three different non-conformity measures. The results showed that our anomaly detector effectively identified previously unseen cell types, producing well-calibrated prediction sets. This rigorous annotation helped maintain coverage probabilities at the expected significance level. Finally, we illustrate how integrating conformal prediction outputs enhances downstream analyses.

Availability and implementation: The automatic scRNA-seq annotator is available at https://github.com/digital-medicine-research-group-UNAV/conformalized_single_cell_annotator and https://doi.org/10.5281/zenodo.15870599.

Supplementary information: Supplementary data are available at Bioinformatics online.
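
To make the notion of a well-calibrated prediction set concrete, here is a minimal sketch of split conformal classification. It is not the authors' implementation and covers only challenge (ii). It assumes a classifier that outputs class probabilities, a held-out calibration set exchangeable with the query cells, and the non-conformity score 1 − p(true label), which may or may not be among the three measures studied in the paper. The function names and the simulated "cells" are hypothetical, introduced only for illustration.

```python
import math
import random


def conformal_threshold(cal_probs, cal_labels, alpha):
    """Score threshold estimated from a held-out calibration set."""
    n = len(cal_labels)
    # Non-conformity score (an assumption): 1 - probability of the true label.
    scores = sorted(1.0 - p[y] for p, y in zip(cal_probs, cal_labels))
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    k = math.ceil((n + 1) * (1.0 - alpha))
    if k > n:
        return math.inf  # too few calibration cells: every label is kept
    return scores[k - 1]


def prediction_set(probs, threshold):
    """Labels whose non-conformity score does not exceed the threshold."""
    return [label for label, p in enumerate(probs) if 1.0 - p <= threshold]


def simulate_cells(rng, n_cells, n_classes):
    """Toy stand-in for a cell-type classifier: noisy softmax outputs."""
    probs, labels = [], []
    for _ in range(n_cells):
        y = rng.randrange(n_classes)
        # The true class receives a boost, so it usually scores highest.
        logits = [rng.gauss(0.0, 1.0) + (2.0 if c == y else 0.0)
                  for c in range(n_classes)]
        top = max(logits)
        exp = [math.exp(v - top) for v in logits]
        total = sum(exp)
        probs.append([e / total for e in exp])
        labels.append(y)
    return probs, labels


def empirical_coverage(seed=0, alpha=0.1):
    """Fraction of simulated query cells whose set contains the true label."""
    rng = random.Random(seed)
    cal_probs, cal_labels = simulate_cells(rng, 500, 4)
    test_probs, test_labels = simulate_cells(rng, 2000, 4)
    thr = conformal_threshold(cal_probs, cal_labels, alpha)
    hits = sum(y in prediction_set(p, thr)
               for p, y in zip(test_probs, test_labels))
    return hits / len(test_labels)


if __name__ == "__main__":
    print(f"empirical coverage: {empirical_coverage():.3f}")
```

With `alpha = 0.1`, the fraction of query cells whose set contains the true label should be close to 0.9 whenever calibration and query cells are exchangeable. Ambiguous cells receive larger sets instead of a forced single label, which is the sense in which the sets quantify annotation uncertainty.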
