scAI-SNP: a method for inferring ancestry from single-cell data.

BMC methods, 2(1):10. Pub Date: 2025-01-01. Epub Date: 2025-05-19. DOI: 10.1186/s44330-025-00029-4
Sung Chul Hong, Francesc Muyas, Isidro Cortés-Ciriano, Sahand Hormoz
Citations: 0

Abstract

Background: Collaborative efforts, such as the Human Cell Atlas, are rapidly accumulating large amounts of single-cell data. To ensure that single-cell atlases are representative of human genetic diversity, we need to determine the ancestry of the donors from whom single-cell data are generated. Self-reporting of race and ethnicity, although important, can be biased and is not always available for the datasets already collected.

Methods: Here, we introduce scAI-SNP, a tool to infer ancestry directly from single-cell genomics data. To train scAI-SNP, we identified 4.5 million ancestry-informative single-nucleotide polymorphisms (SNPs) in the 1000 Genomes Project dataset across 3201 individuals from 26 population groups. For a query single-cell dataset, scAI-SNP uses these ancestry-informative SNPs to compute the contribution of each of the 26 population groups to the ancestry of the donor from whom the cells were obtained.
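The abstract does not say how the per-population contributions are computed, so the following is only an illustrative sketch of the general idea, not the scAI-SNP algorithm. It assumes a hypothetical toy reference panel of alternate-allele frequencies for three made-up groups (A, B, C) at six SNPs, and fits non-negative weights that sum to 1 by exponentiated-gradient descent on the squared error, using only the SNPs that the sparse query actually covers. The names REFERENCE_FREQS and estimate_contributions, and all numeric values, are assumptions introduced for illustration.

```python
import math

# Hypothetical toy reference panel: alternate-allele frequency of each
# population group at six ancestry-informative SNPs (values are made up).
REFERENCE_FREQS = {
    "A": [0.9, 0.1, 0.8, 0.2, 0.7, 0.1],
    "B": [0.1, 0.9, 0.2, 0.8, 0.1, 0.9],
    "C": [0.5, 0.5, 0.9, 0.9, 0.1, 0.1],
}


def estimate_contributions(query, reference, steps=1000, eta=0.5):
    """Estimate non-negative population weights that sum to 1.

    query maps SNP index -> observed alternate-allele fraction (0..1).
    SNPs absent from the dict are treated as not covered and are ignored,
    which mimics the sparsity of single-cell data.
    """
    groups = list(reference)
    covered = sorted(query)
    if not covered:
        raise ValueError("query has no covered SNPs")

    # Start from equal contributions for every group.
    weights = {g: 1.0 / len(groups) for g in groups}
    for _ in range(steps):
        # Residual between the weighted reference mixture and the query.
        residual = [
            sum(weights[g] * reference[g][s] for g in groups) - query[s]
            for s in covered
        ]
        # Gradient of the mean squared error with respect to each weight.
        grad = {
            g: 2.0 / len(covered)
            * sum(r * reference[g][s] for r, s in zip(residual, covered))
            for g in groups
        }
        # Multiplicative update, then renormalize so the weights stay
        # non-negative and sum to 1.
        updated = {g: weights[g] * math.exp(-eta * grad[g]) for g in groups}
        total = sum(updated.values())
        weights = {g: v / total for g, v in updated.items()}
    return weights


# A sparse query that covers only four of the six SNPs.
sparse_query = {0: 0.1, 1: 0.9, 3: 0.8, 5: 0.9}
contributions = estimate_contributions(sparse_query, REFERENCE_FREQS)
for group, weight in contributions.items():
    print(f"{group}: {weight:.3f}")
```

In this toy case the query matches group B at every covered SNP, so most of the weight should end up on B. A real reference panel would have 26 groups and millions of SNPs, and the actual tool may use a different model entirely.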

Results: Using diverse single-cell datasets with matched whole-genome sequencing data, we show that scAI-SNP is robust to the sparsity of single-cell data, can accurately and consistently infer ancestry from samples derived from diverse types of tissues and cancer cells, and can be applied to different modalities of single-cell profiling assays, such as single-cell RNA-seq and single-cell ATAC-seq.

Discussion: Finally, we argue that ensuring that single-cell atlases represent diverse ancestry, ideally alongside race and ethnicity, is ultimately important for improved and equitable health outcomes by accounting for human diversity.

Supplementary information: The online version contains supplementary material available at 10.1186/s44330-025-00029-4.
