Parameter Scaling in Population Genetics Simulations May Introduce Unintended Background Selection: Considerations for Scaled Simulation Design.
Authors: Tessa Ferrari, Siyuan Feng, Xinjun Zhang, Jazlyn Mooney
Journal: Genome Biology and Evolution (published 2025-05-23)
DOI: 10.1093/gbe/evaf097 (https://doi.org/10.1093/gbe/evaf097)
Scaling is a common practice in population genetic simulations to increase computational efficiency. However, few studies systematically examine the effects of scaling on diversity estimates and the comparability of scaled results to unscaled simulations and empirical data. We investigate the effects of scaling in two species, modern humans and Drosophila melanogaster. These species have stark differences in population size and generation time, necessitating moderate-to-no scaling for humans and dramatic scaling for Drosophila. We determine how coalescence, runtime, memory, estimates of diversity, the site frequency spectra, and linkage disequilibrium are influenced by scaling. We also examine the impact of simulated segment length and burn-in time on these metrics. Our results demonstrate that while computational efficiency improves with scaling, large scaling factors distort genetic diversity and dynamics between genetic variants, resulting in deviations from the intended model and empirical observations. Specifically, strongly scaled simulations may experience stronger negative selection on deleterious mutations, which amplifies background selection and purges linked mutations, leaving only rare strongly deleterious variants in the final population. We additionally show that a heuristic burn-in length of 10N generations is often insufficient for full coalescence in both models and alters expected linkage disequilibrium patterns. Finally, we provide considerations for conducting scaled simulations and offer potential strategies for the mitigation of scaling effects. For most non-model species simulations, we advocate for a bespoke scaling strategy drawn from these use-cases.
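As a rough illustration of what "scaling" means here, the sketch below applies the rescaling convention commonly used in forward-time simulations: population size and elapsed generations are divided by a factor Q, while per-generation mutation rate, recombination rate, and selection coefficients are multiplied by Q, so that products such as N·mu and N·s stay constant. This convention is an assumption for illustration and is not taken from the abstract; the class name, function name, and parameter values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SimParams:
    """Hypothetical container for the parameters of a forward simulation."""
    pop_size: int              # N, number of diploid individuals
    mutation_rate: float       # mu, per site per generation
    recomb_rate: float         # r, per site per generation
    selection_coeff: float     # s, negative for deleterious mutations
    burn_in_generations: int   # e.g. the 10N heuristic


def scale_params(p: SimParams, q: float) -> SimParams:
    """Rescale by factor q under the assumed convention (not the paper's code).

    N and time are divided by q; mu, r, and s are multiplied by q,
    which keeps N*mu, N*r, and N*s unchanged.
    """
    if q < 1:
        raise ValueError("scaling factor q must be >= 1")
    return SimParams(
        pop_size=max(1, round(p.pop_size / q)),
        mutation_rate=p.mutation_rate * q,
        # Cap at 0.5, the maximum recombination fraction between two sites.
        recomb_rate=min(0.5, p.recomb_rate * q),
        selection_coeff=p.selection_coeff * q,
        burn_in_generations=max(1, round(p.burn_in_generations / q)),
    )


# Illustrative values only; they are not drawn from the study.
unscaled = SimParams(
    pop_size=10_000,
    mutation_rate=1e-8,
    recomb_rate=1e-8,
    selection_coeff=-0.001,
    burn_in_generations=100_000,  # 10N
)
scaled = scale_params(unscaled, q=10)
```

Under this convention the per-generation selection coefficient grows in proportion to Q even though N·s is preserved, which is one way to see why strongly scaled runs could behave differently from the intended model, as the abstract reports.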
About the journal
Genome Biology and Evolution (GBE) publishes leading original research at the interface between evolutionary biology and genomics. Papers considered for publication report novel evolutionary findings that concern natural genome diversity, population genomics, the structure, function, organisation and expression of genomes, comparative genomics, proteomics, and environmental genomic interactions. Major evolutionary insights from the fields of computational biology, structural biology, developmental biology, and cell biology are also considered, as are theoretical advances in the field of genome evolution. GBE’s scope embraces genome-wide evolutionary investigations at all taxonomic levels and for all forms of life — within populations or across domains. Its aims are to further the understanding of genomes in their evolutionary context and further the understanding of evolution from a genome-wide perspective.