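Parameter Scaling in Population Genetics Simulations May Introduce Unintended Background Selection: Considerations for Scaled Simulation Design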

IF 3.2, Zone 2 (Biology), JCR Q2 (Evolutionary Biology)

Genome Biology and Evolution Pub Date : 2025-05-23 DOI:10.1093/gbe/evaf097

Tessa Ferrari, Siyuan Feng, Xinjun Zhang, Jazlyn Mooney


Citations: 0

Abstract



Scaling is a common practice in population genetic simulations to increase computational efficiency. However, few studies systematically examine the effects of scaling on diversity estimates and the comparability of scaled results to unscaled simulations and empirical data. We investigate the effects of scaling in two species, modern humans and Drosophila melanogaster. These species have stark differences in population size and generation time, necessitating moderate-to-no scaling for humans and dramatic scaling for Drosophila. We determine how coalescence, runtime, memory, estimates of diversity, the site frequency spectra, and linkage disequilibrium are influenced by scaling. We also examine the impact of simulated segment length and burn-in time on these metrics. Our results demonstrate that while computational efficiency improves with scaling, large scaling factors distort genetic diversity and dynamics between genetic variants, resulting in deviations from the intended model and empirical observations. Specifically, strongly scaled simulations may experience stronger negative selection on deleterious mutations, which amplifies background selection and purges linked mutations, leaving only rare strongly deleterious variants in the final population. We additionally show that a heuristic burn-in length of 10N generations is often insufficient for full coalescence in both models and alters expected linkage disequilibrium patterns. Finally, we provide considerations for conducting scaled simulations and offer potential strategies for the mitigation of scaling effects. For most non-model species simulations, we advocate for a bespoke scaling strategy drawn from these use-cases.
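
The abstract does not spell out the rescaling scheme, so the following is only a minimal Python sketch of one widely used convention, assumed here for illustration: with scaling factor Q, population size and generation counts are divided by Q, while mutation rate, recombination rate, and selection coefficients are multiplied by Q so that products such as N*s stay constant. The names `SimParams`, `scale_params`, and `burn_in_generations` are hypothetical, and any numeric values passed in are placeholders rather than values from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SimParams:
    """Hypothetical container for forward-simulation parameters."""
    pop_size: int            # diploid population size N
    mutation_rate: float     # per site, per generation
    recomb_rate: float       # per site, per generation
    selection_coeff: float   # s of a deleterious mutation (negative)
    generations: int         # number of generations to simulate


def scale_params(p: SimParams, q: float) -> SimParams:
    """Rescale parameters by factor q under the assumed convention.

    N and time are divided by q; rates and s are multiplied by q.
    s is bounded at -1 (lethal) and r at 0.5 (free recombination),
    so very large q can no longer preserve N*s or N*r.
    """
    if q < 1:
        raise ValueError("scaling factor q must be >= 1")
    return SimParams(
        pop_size=max(1, round(p.pop_size / q)),
        mutation_rate=p.mutation_rate * q,
        recomb_rate=min(0.5, p.recomb_rate * q),
        selection_coeff=max(-1.0, p.selection_coeff * q),
        generations=max(1, round(p.generations / q)),
    )


def burn_in_generations(p: SimParams, multiple: int = 10) -> int:
    """Heuristic burn-in length of `multiple` * N generations."""
    return multiple * p.pop_size
```

Under this assumed convention, a mildly deleterious s = -0.01 scaled by Q = 1000 reaches the lethal bound of -1, which is one simple way to see how strong scaling can change selection dynamics. Whether this is the mechanism the authors analyse is not stated in the abstract. Note also that `burn_in_generations` only encodes the 10N heuristic, which the abstract reports is often insufficient for full coalescence.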


Source journal: Genome Biology and Evolution (Evolutionary Biology; Genetics & Heredity)

CiteScore: 5.80
Self-citation rate: 6.10%
Articles published: 169
Review time: 1 month

About the journal: Genome Biology and Evolution (GBE) publishes leading original research at the interface between evolutionary biology and genomics. Papers considered for publication report novel evolutionary findings that concern natural genome diversity, population genomics, the structure, function, organisation and expression of genomes, comparative genomics, proteomics, and environmental genomic interactions. Major evolutionary insights from the fields of computational biology, structural biology, developmental biology, and cell biology are also considered, as are theoretical advances in the field of genome evolution. GBE’s scope embraces genome-wide evolutionary investigations at all taxonomic levels and for all forms of life, within populations or across domains. Its aims are to further the understanding of genomes in their evolutionary context and further the understanding of evolution from a genome-wide perspective.