Effective Population Size Estimation in Large Marine Populations: Considering Current Challenges and Opportunities When Simulating Large Data Sets With High-Density Genomic Information

Impact factor: 3.2; Zone 2 (Biology); JCR Q1 (Evolutionary Biology)
Chrystelle Delord, Sophie Arnaud-Haond, Agostino Leone, Ekaterina Noskova, Rémi Tournebize, Patrick Jacques, Francis Marsac, Natacha Nikolic
Evolutionary Applications 18(8), published 2025-07-28. DOI: 10.1111/eva.70121. https://onlinelibrary.wiley.com/doi/10.1111/eva.70121

Abstract

Next-generation sequencing has broadened perspectives on estimating the effective population size (Ne) by providing high-density genomic information. These technologies have expanded data collection and analytical tools in population genetics, improving our understanding of highly abundant populations, such as marine species of high commercial or conservation priority. Several common methods for estimating Ne are based on allele frequency spectra or linkage disequilibrium between loci. However, their specific constraints make them difficult to apply to large populations, especially in the presence of confounding factors such as migration rates, complex sampling schemes or non-independence between loci. Computer simulations have long been invaluable tools for exploring the influence of biological or logistical factors on Ne estimation and for assessing the robustness of dedicated methods. Here, we outline several Ne estimation methods along with their foundational principles, requirements and likely caveats when applied to populations of high abundance. We then present a simulation framework built upon recent computational genomic tools that make it possible to generate biologically realistic data sets with realistic patterns of long-term neutral genetic diversity. This framework aims to reproduce and track the main critical features of data derived from a large natural population when running a simulation-based population genetics study, for example, one evaluating the strengths and limitations of various Ne estimation methods. We illustrate this framework by generating genotype data sets with varying sample sizes and numbers of loci and analysing them with three software tools (NeEstimator2, GONE and GADMA). Detailed and annotated simulation scripts are provided to ensure reproducibility and to support future research on Ne estimation. These resources can support method comparisons and validations, particularly for non-specialists, such as conservation practitioners and students.
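
As a rough illustration of why linkage-disequilibrium methods become hard to apply to very large populations, the sketch below uses a classical first-order approximation that is not taken from this paper: for unlinked loci in a randomly mating population, the expected squared correlation between loci is about 1/(3Ne) + 1/S, where S is the number of sampled individuals. This approximation, the function names and the example values are all assumptions made for illustration. Tools such as NeEstimator2 apply further bias corrections, and this is not the authors' simulation framework.

```python
# Minimal sketch, assuming E[r^2] ~ 1/(3*Ne) + 1/S for unlinked loci under
# random mating (a textbook first-order approximation, not the paper's method).

def expected_r2(ne: float, sample_size: int) -> float:
    """Approximate expected r^2: drift term 1/(3*Ne) plus sampling term 1/S."""
    return 1.0 / (3.0 * ne) + 1.0 / sample_size


def ld_ne_estimate(mean_r2: float, sample_size: int) -> float:
    """Invert the approximation to recover Ne from an observed mean r^2.

    Returns infinity when the sampling term alone explains the observed r^2,
    i.e. when no drift signal is left after the correction.
    """
    drift_r2 = mean_r2 - 1.0 / sample_size
    if drift_r2 <= 0:
        return float("inf")
    return 1.0 / (3.0 * drift_r2)


def drift_signal_fraction(ne: float, sample_size: int) -> float:
    """Share of the expected r^2 that is due to drift rather than sampling."""
    return (1.0 / (3.0 * ne)) / expected_r2(ne, sample_size)


if __name__ == "__main__":
    sample_size = 50  # hypothetical number of genotyped individuals
    for ne in (1e2, 1e4, 1e6):
        frac = drift_signal_fraction(ne, sample_size)
        print(f"Ne={ne:.0e}: drift share of expected r^2 = {frac:.2e}")
```

Under these assumptions, the drift share of r^2 shrinks roughly in proportion to S/(3Ne), so for a fixed sample size the signal from a population of a million individuals is orders of magnitude weaker than the sampling noise. This is one way to see why sample size and locus number matter so much in simulation studies of large populations.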

Source journal: Evolutionary Applications (Biology: Evolutionary Biology)
CiteScore: 8.50
Self-citation rate: 7.30%
Articles published: 175
Review time: 6 months
About the journal: Evolutionary Applications is a fully peer-reviewed open access journal. It publishes papers that use concepts from evolutionary biology to address biological questions of health, social and economic relevance. Papers are expected to employ evolutionary concepts or methods to contribute to areas such as (but not limited to) medicine, agriculture, forestry, exploitation and management (fisheries and wildlife), aquaculture, conservation biology, environmental sciences (including climate change and invasion biology), microbiology and toxicology. All taxonomic groups are covered, from microbes and fungi to plants and animals. To better serve the community, the journal also strongly encourages submissions that use modern molecular and genetic methods (population and functional genomics, transcriptomics, proteomics, epigenetics, quantitative genetics, association and linkage mapping) to address important questions in any of these disciplines within an applied evolutionary framework. Theoretical, empirical, synthesis or perspective papers are welcome.