Arun Sethuraman, Melissa Lynch, Margaret Wanjiku, Michael Kuzminskiy
Journal: G3: Genes|Genomes|Genetics. Publication type: Journal Article. Publication date: 2025-10-08. DOI: 10.1093/g3journal/jkaf180 (https://doi.org/10.1093/g3journal/jkaf180). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12506660/pdf/
Accounting for gene flow from unsampled ghost populations while estimating evolutionary history.
Gene flow from unsampled or extinct ghost populations leaves signatures on the genomes of individuals from extant, sampled populations, often introducing biases, data misinterpretation, and ambiguous results when estimating evolutionary history from population genomic data. Here we establish theoretical expectations for these biases, and then use extensive simulations under a variety of ghost topologies to systematically assess biases, with and without accounting for gene flow from ghost populations, in (i) population genetics summary statistics such as π, FST, and Tajima's D and (ii) demographic history (mutation-scaled effective population sizes, divergence times, and migration rates) under the Isolation with Migration (IM) model. Across all scenarios of deep divergence of an outgroup ghost, estimates of evolutionary history show consistent (i) under-estimation of divergence times between sampled populations, (ii) over-estimation of effective population sizes of sampled populations, and (iii) under-estimation of migration rates between sampled populations, with increased gene flow from the unsampled ghost population. Without accounting for an unsampled ghost, summary statistics like FST are under-estimated, and π is over-estimated, with increased gene flow from the ghost. These biases in summary statistics and population structure are, however, not captured under models of recent IM that approximate the time scales of the evolution of anatomically modern humans and Neanderthals; there they are recapitulated solely by model-based estimation of evolutionary history. We also analyze a 355-locus dataset from African Hunter-Gatherer populations and discuss similar biases in estimating evolutionary history when an unsampled ghost is not accounted for.
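For readers unfamiliar with the three summary statistics named in the abstract, the following is a minimal sketch of how π, Tajima's D, and FST can be computed from aligned haplotypes. It is not the authors' pipeline. The toy haplotypes, the function names, and the choice of the Hudson-style estimator FST = 1 − π_within / π_between are all assumptions made for illustration; π is reported per sequence rather than per site.

```python
from itertools import combinations
from math import sqrt


def pairwise_diff(a, b):
    """Number of sites at which two aligned haplotypes differ."""
    return sum(x != y for x, y in zip(a, b))


def pi_within(haps):
    """Nucleotide diversity: mean pairwise differences within one sample."""
    pairs = list(combinations(haps, 2))
    return sum(pairwise_diff(a, b) for a, b in pairs) / len(pairs)


def pi_between(pop1, pop2):
    """Mean pairwise differences between haplotypes of two samples."""
    total = sum(pairwise_diff(a, b) for a in pop1 for b in pop2)
    return total / (len(pop1) * len(pop2))


def segregating_sites(haps):
    """Count alignment columns that carry more than one allele."""
    return sum(len(set(col)) > 1 for col in zip(*haps))


def tajimas_d(haps):
    """Tajima's D (Tajima 1989) from pi and the number of segregating sites."""
    n = len(haps)
    s = segregating_sites(haps)
    if s == 0:
        return float("nan")  # D is undefined without variation
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    return (pi_within(haps) - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


def fst_hudson(pop1, pop2):
    """Hudson-style FST: 1 - (mean within-sample pi) / (between-sample pi)."""
    within = (pi_within(pop1) + pi_within(pop2)) / 2
    return 1 - within / pi_between(pop1, pop2)


# Hypothetical toy data: 4 haplotypes x 5 biallelic sites per population.
pop_a = ["00000", "00001", "00011", "00001"]
pop_b = ["11000", "11100", "11000", "11001"]

print("pi(A)  =", pi_within(pop_a))
print("D(A)   =", tajimas_d(pop_a))
print("FST    =", fst_hudson(pop_a, pop_b))
```

Under this definition, anything that inflates within-population diversity relative to between-population divergence, such as shared alleles introduced from a third source, pushes FST down, which is the direction of bias the abstract reports for unaccounted ghost gene flow.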
About the journal:
G3: Genes, Genomes, Genetics provides a forum for the publication of high‐quality foundational research, particularly research that generates useful genetic and genomic information such as genome maps, single gene studies, genome‐wide association and QTL studies, as well as genome reports, mutant screens, and advances in methods and technology. The Editorial Board of G3 believes that rapid dissemination of these data is the necessary foundation for analysis that leads to mechanistic insights.
G3, published by the Genetics Society of America, meets the critical and growing need of the genetics community for rapid review and publication of important results in all areas of genetics. G3 offers the opportunity to publish the puzzling finding or to present unpublished results that may not have been submitted for review and publication due to a perceived lack of a potential high-impact finding. G3 has earned the DOAJ Seal, which is a mark of certification for open access journals, awarded by DOAJ to journals that achieve a high level of openness, adhere to Best Practice and high publishing standards.