A Refined Analysis of Neanderthal-Introgressed Sequences in Modern Humans with a Complete Reference Genome

Shen-Ao Liang, Tianxin Ren, Jiayu Zhang, Jiahui He, Xuankai Wang, Xinrui Jiang, Yuan He, Rajiv C. McCoy, Qiaomei Fu, Joshua M. Akey, Yafei Mao, Lu Chen
bioRxiv - Evolutionary Biology. Published 2024-08-10. DOI: 10.1101/2024.08.09.607285 (https://doi.org/10.1101/2024.08.09.607285)

Abstract

Background: Using long-read sequencing technologies, the first complete human reference genome, T2T-CHM13, corrects assembly errors in prior references and resolves the remaining 8% of the genome. Studies of archaic admixture in modern humans have so far relied on the GRCh37 reference because of the available archaic genome data, so the impact of T2T-CHM13 in this field remains unknown. Results: We remapped the sequencing reads of the high-quality Altai Neanderthal and Denisovan genomes onto GRCh38 and T2T-CHM13. Compared with GRCh37, T2T-CHM13 significantly improves read mapping quality in the archaic samples. We then applied IBDmix to identify Neanderthal-introgressed sequences in 2,504 individuals from 26 geographically diverse populations under each reference. We observed that the different pre-phasing filtering strategies commonly used in public data can substantially affect the determination of archaic ancestry, which calls for careful choice of filters. We discovered ~51 Mb of Neanderthal sequences unique to T2T-CHM13, located predominantly in regions where variants that differ between the GRCh38 and T2T-CHM13 assemblies occur. In addition, we uncovered new instances of population-specific archaic introgression in diverse populations, covering genes involved in metabolism, olfaction, and ion channels. Finally, we integrated the introgressed sequences and adaptive signals for all references into a visualization database website, ASH (www.arcseqhub.com), to facilitate the use of archaic alleles and adaptive signals in human genomics and evolutionary research. Conclusions: Our study refines the detection of archaic variation in modern humans, highlights the utility of the T2T-CHM13 reference, and provides novel insights into the functional consequences of archaic hominin admixture.
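
A figure such as "~51 Mb" is a total length of sequence. One plausible way to obtain such a total from per-individual segment calls is to merge overlapping segments across individuals and sum the merged lengths. The sketch below illustrates only that idea and is not the authors' pipeline: the function name `union_length_bp`, the `(chrom, start, end)` tuple layout, the half-open coordinates, and the example values are all assumptions made for illustration.

```python
from collections import defaultdict


def union_length_bp(segments):
    """Return the number of bases covered by at least one segment.

    `segments` is an iterable of (chrom, start, end) tuples with
    half-open coordinates [start, end). This layout is an assumption,
    not a format taken from the study.
    """
    # Group intervals by chromosome so they are merged independently.
    by_chrom = defaultdict(list)
    for chrom, start, end in segments:
        by_chrom[chrom].append((start, end))

    total = 0
    for intervals in by_chrom.values():
        intervals.sort()
        cur_start, cur_end = intervals[0]
        for start, end in intervals[1:]:
            if start <= cur_end:
                # Overlapping or touching: extend the current merged block.
                cur_end = max(cur_end, end)
            else:
                # Gap: close the current block and start a new one.
                total += cur_end - cur_start
                cur_start, cur_end = start, end
        total += cur_end - cur_start
    return total


# Hypothetical segments pooled from several individuals (not real data).
example = [
    ("chr1", 100, 200),
    ("chr1", 150, 300),
    ("chr1", 400, 500),
    ("chr2", 0, 50),
]
print(union_length_bp(example) / 1e6, "Mb")
```

Comparing totals between references would additionally require placing the segments in a common coordinate system, which this sketch does not attempt.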