Testing times: disentangling admixture histories in recent and complex demographies using ancient DNA.

IF 3.3, Zone 3 (Biology), Q2 GENETICS & HEREDITY

Genetics Pub Date : 2024-09-04 DOI:10.1093/genetics/iyae110

Matthew P Williams, Pavel Flegontov, Robert Maier, Christian D Huber

Citations: 0

Abstract

Our knowledge of human evolutionary history has been greatly advanced by paleogenomics. Since the 2020s, the study of ancient DNA has increasingly focused on reconstructing the recent past. However, the accuracy of paleogenomic methods in resolving questions of historical and archaeological importance, given the increased demographic complexity and decreased genetic differentiation of recent periods, remains an open question. We evaluated the performance and behavior of two commonly used methods, qpAdm and the f3-statistic, on admixture inference under a diversity of demographic models and data conditions. We used two complementary simulation approaches. First, we explored a wide demographic parameter space under four simple demographic models of varying complexity and configuration, using branch-length data from two chromosomes. Second, we analyzed a model of Eurasian history composed of 59 populations, using whole-genome data modified with ancient DNA conditions such as SNP ascertainment, data missingness, and pseudohaploidization. We observe that population differentiation is the primary factor driving qpAdm performance. Notably, while complex gene flow histories influence which models are classified as plausible, they do not reduce overall performance. Under conditions reflective of the historical period, qpAdm most frequently identifies the true model as plausible among a small candidate set of closely related populations. To increase the utility for resolving fine-scaled hypotheses, we provide a heuristic for further distinguishing between candidate models that incorporates qpAdm model P-values and f3-statistics. Finally, we demonstrate a significant performance increase for qpAdm using whole-genome branch-length f2-statistics, highlighting the potential for improved demographic inference that could be achieved with future advancements in f-statistic estimation.
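
Two terms in the abstract, the f3-statistic and pseudohaploidization, can be made concrete with a minimal Python sketch. It is not the authors' pipeline or the qpAdm/ADMIXTOOLS implementation. It assumes per-SNP allele frequencies are already available and uses the naive estimator f3(C; A, B) = mean over SNPs of (c - a)(c - b), without the finite-sample bias correction or block-jackknife standard errors used in practice. All function names and toy values are hypothetical.

```python
import random


def f3_statistic(target, source_a, source_b):
    """Naive f3(target; source_a, source_b) from per-SNP allele frequencies.

    Averages (c - a) * (c - b) over SNPs. No bias correction and no
    standard error are computed; this is an illustrative assumption.
    """
    if not target or not (len(target) == len(source_a) == len(source_b)):
        raise ValueError("frequency lists must be non-empty and equal in length")
    products = [(c - a) * (c - b) for c, a, b in zip(target, source_a, source_b)]
    return sum(products) / len(products)


def pseudohaploidize(genotypes, rng):
    """Reduce diploid genotypes to one randomly drawn allele per site.

    genotypes: alternate-allele counts (0, 1, or 2), or None if missing.
    Homozygous sites keep their allele, heterozygous sites draw 0 or 1
    at random, and missing sites stay None.
    """
    haploid = []
    for g in genotypes:
        if g is None:
            haploid.append(None)
        elif g == 1:
            haploid.append(rng.choice((0, 1)))
        else:
            haploid.append(g // 2)
    return haploid


if __name__ == "__main__":
    # Toy frequencies (hypothetical): the target is a 50/50 mix of A and B.
    freq_a = [0.1, 0.9, 0.4]
    freq_b = [0.7, 0.3, 0.8]
    freq_mixed = [0.5 * a + 0.5 * b for a, b in zip(freq_a, freq_b)]
    print("f3(mixed; A, B) =", f3_statistic(freq_mixed, freq_a, freq_b))

    rng = random.Random(0)
    print("pseudohaploid calls:", pseudohaploidize([0, 1, 2, None, 1], rng))
```

A negative f3 for the mixed target is the admixture signal the statistic tests for; real analyses judge it against a standard error (a Z-score) rather than by sign alone.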

Source journal

Genetics (GENETICS & HEREDITY)

CiteScore: 6.90

Self-citation rate: 6.10%

Articles published: 177

Review time: 1.5 months

Journal description: GENETICS is published by the Genetics Society of America, a scholarly society that seeks to deepen our understanding of the living world by advancing our understanding of genetics. Since 1916, GENETICS has published high-quality, original research presenting novel findings bearing on genetics and genomics. The journal publishes empirical studies of organisms ranging from microbes to humans, as well as theoretical work. While it has an illustrious history, GENETICS has changed along with the communities it serves: it is not your mentor's journal. The editors make decisions quickly – in around 30 days – without sacrificing the excellence and scholarship for which the journal has long been known. GENETICS is a peer-reviewed, peer-edited journal, with an international reach and increasing visibility and impact. All editorial decisions are made through collaboration of at least two editors who are practicing scientists. GENETICS is constantly innovating: expanded types of content include Reviews, Commentary (current issues of interest to geneticists), Perspectives (historical), Primers (to introduce primary literature into the classroom), Toolbox Reviews, plus YeastBook, FlyBook, and WormBook (coming spring 2016). For particularly time-sensitive results, we publish Communications. As part of our mission to serve our communities, we've published thematic collections, including Genomic Selection, Multiparental Populations, Mouse Collaborative Cross, and the Genetics of Sex.