Approximate Bayesian computational methods to estimate the strength of divergent selection in population genomics models

Martyna Lukaszewicz, Ousseini Issaka Salia, Paul A. Hohenlohe, Erkan O. Buzbas
Journal of Computational Mathematics and Data Science, volume 10, Article 100091. Published 2024-02-07. DOI: 10.1016/j.jcmds.2024.100091
Article page: https://www.sciencedirect.com/science/article/pii/S2772415824000026

Abstract

Statistical estimation of parameters in large models of evolutionary processes is often too computationally expensive to pursue using exact model likelihoods, even with single-nucleotide polymorphism (SNP) data, which reduce the size of genetic data while retaining relevant information. Approximate Bayesian Computation (ABC) performs statistical inference about the parameters of large models by using simulations to bypass direct evaluation of model likelihoods. We develop a mechanistic model to simulate forward-in-time divergent selection with variable migration rates, modes of reproduction (sexual, asexual), and length and number of migration-selection cycles. We investigate the computational feasibility of ABC for statistical inference and study the quality of estimates of the positions of loci under selection and of the strength of selection. To expand the parameter space of positions under selection, we enhance the model by implementing an outlier scan on summarized observed data. We evaluate the usefulness of summary statistics well known to capture the strength of selection, and assess their informativeness under divergent selection. We also evaluate the effect of genetic drift relative to an idealized deterministic model with single-locus selection. We discuss the role of the recombination rate as a confounding factor in estimating the strength of divergent selection, and emphasize its importance in the breakdown of linkage disequilibrium (LD). We identify the part of the model's parameter space in which a strong signal for estimating selection is recovered, and determine whether population differentiation-based or LD-based summary statistics perform well in estimating selection.
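
The ABC idea summarized in the abstract (draw a parameter from the prior, simulate data, and keep the draw when a summary of the simulated data falls close to the observed summary) can be illustrated with a minimal rejection sampler. The sketch below is not the authors' model: it assumes a toy single-population, single-locus haploid Wright-Fisher simulation, a uniform prior on the selection coefficient, and the final allele frequency as the only summary statistic. The names `simulate_final_freq` and `abc_rejection`, and all default parameter values, are hypothetical choices made for illustration.

```python
import numpy as np


def simulate_final_freq(s, n_pop=500, p0=0.1, generations=50, rng=None):
    """Toy forward-in-time simulator (an assumption, not the paper's model).

    Haploid Wright-Fisher population: each generation applies selection
    deterministically, then genetic drift through binomial resampling.
    Returns the final frequency of the favored allele.
    """
    rng = np.random.default_rng() if rng is None else rng
    p = p0
    for _ in range(generations):
        # Selection: favored allele has relative fitness 1 + s.
        p_sel = p * (1.0 + s) / (1.0 + s * p)
        # Drift: resample n_pop individuals from the post-selection pool.
        p = rng.binomial(n_pop, p_sel) / n_pop
    return p


def abc_rejection(observed, n_draws=5000, tolerance=0.05, prior_max=0.2, seed=0):
    """Rejection ABC for the selection coefficient s.

    Prior: s ~ Uniform(0, prior_max) (an illustrative assumption).
    A draw is accepted when the simulated summary statistic lies within
    `tolerance` of the observed one, so no likelihood is ever evaluated.
    """
    rng = np.random.default_rng(seed)
    draws = rng.uniform(0.0, prior_max, size=n_draws)
    accepted = [
        s for s in draws
        if abs(simulate_final_freq(s, rng=rng) - observed) <= tolerance
    ]
    return np.array(accepted)


if __name__ == "__main__":
    # Hypothetical observed summary: favored allele at frequency 0.5.
    posterior = abc_rejection(observed=0.5)
    print("accepted draws:", posterior.size)
    if posterior.size:
        print("approximate posterior mean of s:", posterior.mean())
```

The accepted draws approximate the posterior distribution of the selection coefficient given the observed summary. Shrinking `tolerance` tightens the approximation at the cost of fewer accepted draws, which is the basic trade-off behind the question of computational feasibility raised above. A model with migration, recombination, and multiple loci would replace the simulator and would typically need several summary statistics rather than one.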
