Assessing the limits of local ancestry inference from small reference panels
Sandra Oliveira, Nina Marchi, Laurent Excoffier
Molecular Ecology Resources (impact factor 5.5; JCR Q1, Biochemistry & Molecular Biology), published 2024-05-22
DOI: 10.1111/1755-0998.13981
https://onlinelibrary.wiley.com/doi/10.1111/1755-0998.13981
Abstract
Admixture is a common biological phenomenon among populations of the same or different species. Identifying admixed tracts within individual genomes can help to date admixture events, reconstruct ancestry-specific demographic histories, and detect adaptive introgression, genetic incompatibilities, and regions of the genome affected by (associative) overdominance. Although many local ancestry inference (LAI) methods have been developed in the last decade, their performance has been assessed using large reference panels, which are rarely available for non-model organisms or ancient samples. Moreover, the demographic conditions under which LAI becomes unreliable have not been explicitly outlined. Here, we identify the demographic conditions under which local ancestries can be best estimated using very small reference panels. Furthermore, we compare the performance of two LAI methods (RFMix and MOSAIC) with that of a newly developed approach (simpLAI) that can be used even when reference populations consist of single individuals. Based on simulations of various demographic models, we also determine the limits of these LAI tools and propose post-painting filtering steps to reduce false-positive rates and improve the precision and accuracy of the inferred admixed tracts. Besides providing a guide for using LAI, our work shows that reasonable inferences can be obtained from a single diploid genome per reference under demographic conditions that are not uncommon among past human groups and non-model organisms.
About the journal:
Molecular Ecology Resources promotes the creation of comprehensive resources for the scientific community, encompassing computer programs, statistical and molecular advancements, and a diverse array of molecular tools. Serving as a conduit for disseminating these resources, the journal targets a broad audience of researchers in the fields of evolution, ecology, and conservation. Articles in Molecular Ecology Resources are crafted to support investigations tackling significant questions within these disciplines.
In addition to original resource articles, Molecular Ecology Resources features Reviews, Opinions, and Comments relevant to the field. The journal also periodically releases Special Issues focusing on resource development within specific areas.