Estimation of spatial demographic maps from polymorphism data using a neural network
Chris C. R. Smith, Gilia Patterson, Peter L. Ralph, Andrew D. Kern
Molecular Ecology Resources, volume 24, issue 7. Published 2024-08-16.
DOI: 10.1111/1755-0998.14005
https://onlinelibrary.wiley.com/doi/10.1111/1755-0998.14005
Abstract
A fundamental goal in population genetics is to understand how variation is arrayed over natural landscapes. From first principles we know that common features such as heterogeneous population densities and barriers to dispersal should shape genetic variation over space; however, few tools currently available can deal with these ubiquitous complexities. Geographically referenced single nucleotide polymorphism (SNP) data are increasingly accessible, presenting an opportunity to study genetic variation across geographic space in myriad species. We present a new inference method that uses geo-referenced SNPs and a deep neural network to estimate spatially heterogeneous maps of population density and dispersal rate. Our neural network trains on simulated input and output pairings, where the input consists of genotypes and sampling locations generated from a continuous-space population genetic simulator, and the output is a map of the true demographic parameters. We benchmark our tool against existing methods and discuss qualitative differences between the approaches; in particular, our program is unique because it infers the magnitude of both dispersal and density as well as their variation over the landscape, and it does so using SNP data. Similar methods are constrained to estimating relative migration rates, or require identity-by-descent blocks as input. We applied our tool to empirical data from North American grey wolves, for which it estimated mostly reasonable demographic parameters but was affected by incomplete spatial sampling. Genetic-based methods like ours complement other, direct methods for estimating past and present demography, and we believe they will serve as valuable tools for applications in conservation, ecology and evolutionary biology. An open-source software package implementing our method is available from https://github.com/kr-colab/mapNN.
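To make the input and output pairing described in the abstract concrete, here is a minimal sketch of the array shapes one training example might have. It is not the mapNN API: the function name `make_training_pair`, the array sizes, the 0/1/2 genotype coding, and the choice of a two-channel grid for the output map are all assumptions made for illustration. Random draws stand in for the output of a continuous-space simulator.

```python
import numpy as np


def make_training_pair(rng, n_samples=10, n_snps=50, map_size=8):
    """Build one toy (input, target) pair; all values are random placeholders.

    Hypothetical helper for illustration only. In a real pipeline the
    genotypes and locations would come from a population genetic simulator
    run under the demographic map used as the target.
    """
    # Target: a grid with two channels (assumed layout):
    # channel 0 = population density, channel 1 = dispersal rate.
    target_map = rng.uniform(0.5, 5.0, size=(map_size, map_size, 2))

    # Input, part 1: sampling locations as (x, y) in the unit square.
    locations = rng.uniform(0.0, 1.0, size=(n_samples, 2))

    # Input, part 2: diploid genotypes coded as alternate-allele
    # counts 0, 1, or 2 (rows = SNPs, columns = sampled individuals).
    genotypes = rng.integers(0, 3, size=(n_snps, n_samples))

    return (genotypes, locations), target_map


rng = np.random.default_rng(0)
(genotypes, locations), target_map = make_training_pair(rng)
print(genotypes.shape, locations.shape, target_map.shape)
```

A supervised network would then be fit on many such pairs, each simulated under a different demographic map, so that at inference time it can map empirical genotypes and sampling locations to an estimated map.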
About the journal:
Molecular Ecology Resources promotes the creation of comprehensive resources for the scientific community, encompassing computer programs, statistical and molecular advancements, and a diverse array of molecular tools. Serving as a conduit for disseminating these resources, the journal targets a broad audience of researchers in the fields of evolution, ecology, and conservation. Articles in Molecular Ecology Resources are crafted to support investigations tackling significant questions within these disciplines.
In addition to original resource articles, Molecular Ecology Resources features Reviews, Opinions, and Comments relevant to the field. The journal also periodically releases Special Issues focusing on resource development within specific areas.