A novel expectation-maximization approach to infer general diploid selection from time-series genetic data.

IF 3.7, Zone 2 (Biology), Q1 GENETICS & HEREDITY

PLoS Genetics. Pub Date: 2025-07-22; eCollection Date: 2025-07-01; DOI: 10.1371/journal.pgen.1011769

Adam G Fine, Matthias Steinrücken

PLoS Genetics 21(7): e1011769. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12310050/pdf/

Citations: 0

Abstract


A novel expectation-maximization approach to infer general diploid selection from time-series genetic data.

Detecting and quantifying the strength of selection is a major objective in population genetics. Since selection acts over multiple generations, many approaches have been developed to detect and quantify selection using genetic data sampled at multiple points in time. Such time-series genetic data is commonly analyzed using Hidden Markov Models, but in most cases, under the assumption of additive selection. However, many examples of genetic variation exhibiting non-additive mechanisms exist, making it critical to develop methods that can characterize selection in more general scenarios. Here, we extend a previously introduced expectation-maximization algorithm for the inference of additive selection coefficients to the case of general diploid selection, in which the heterozygote and homozygote fitness are parameterized independently. We furthermore introduce a framework to identify bespoke modes of diploid selection from given data, a heuristic to account for variable population size, and a procedure for aggregating data across linked loci to increase power and robustness. Using extensive simulation studies, we find that our method accurately and efficiently estimates selection coefficients for different modes of diploid selection across a wide range of scenarios; however, power to classify the mode of selection is low unless selection is very strong. We apply our method to ancient DNA samples from Great Britain in the last 4,450 years and detect evidence for selection in six genomic regions, including the well-characterized LCT locus. Our work is the first genome-wide scan characterizing signals of general diploid selection.
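To make the modelling idea concrete, here is a minimal sketch of a Wright-Fisher Hidden Markov Model likelihood in which heterozygote and homozygote fitness are set independently. It is not the authors' expectation-maximization implementation. It assumes genotype fitnesses of 1, 1 + s_het, and 1 + s_hom, a constant population size, a uniform initial frequency distribution, and binomial sampling of alleles at each time point. All function names and the example data are hypothetical.

```python
import math


def next_freq(x, s_het, s_hom):
    """Expected focal-allele frequency after one generation of selection.

    Assumed genotype fitnesses: aa = 1, Aa = 1 + s_het, AA = 1 + s_hom.
    """
    mean_fitness = (1 - x) ** 2 + 2 * x * (1 - x) * (1 + s_het) + x * x * (1 + s_hom)
    return (x * x * (1 + s_hom) + x * (1 - x) * (1 + s_het)) / mean_fitness


def binom_pmf(k, n, p):
    """Binomial probability of k successes in n trials, safe at p = 0 or 1."""
    if p <= 0.0:
        return 1.0 if k == 0 else 0.0
    if p >= 1.0:
        return 1.0 if k == n else 0.0
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return math.exp(log_choose + k * math.log(p) + (n - k) * math.log(1 - p))


def transition_matrix(two_n, s_het, s_hom):
    """One-generation Wright-Fisher transitions over allele counts 0..two_n."""
    return [
        [binom_pmf(j, two_n, next_freq(i / two_n, s_het, s_hom)) for j in range(two_n + 1)]
        for i in range(two_n + 1)
    ]


def log_likelihood(samples, two_n, s_het, s_hom):
    """Forward-algorithm log-likelihood of time-series allele counts.

    samples: list of (generation, n_sampled, k_focal), sorted by generation.
    The hidden state is the population allele count; emissions are binomial.
    """
    trans = transition_matrix(two_n, s_het, s_hom)
    n_states = two_n + 1
    alpha = [1.0 / n_states] * n_states  # assumed uniform initial distribution
    loglik = 0.0
    current_gen = samples[0][0]
    for gen, n_sampled, k_focal in samples:
        # Propagate the hidden state forward to the sampling generation.
        for _ in range(gen - current_gen):
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n_states))
                for j in range(n_states)
            ]
        current_gen = gen
        # Weight each state by the probability of the observed sample.
        alpha = [a * binom_pmf(k_focal, n_sampled, j / two_n) for j, a in enumerate(alpha)]
        total = sum(alpha)
        loglik += math.log(total)
        alpha = [a / total for a in alpha]  # rescale to avoid underflow
    return loglik


# Hypothetical data: the focal allele rises from 4/20 to 16/20 over 20 generations.
example_samples = [(0, 20, 4), (10, 20, 10), (20, 20, 16)]
ll_positive = log_likelihood(example_samples, 40, 0.1, 0.2)
ll_negative = log_likelihood(example_samples, 40, -0.1, -0.2)
```

Evaluating `log_likelihood` over a grid of (s_het, s_hom) pairs would give a crude likelihood surface. Restricting the pair, for instance to s_het = s_hom / 2 for additive selection or s_het = s_hom for full dominance, gives one simple way to compare candidate modes of selection under these assumptions.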


Source journal

PLoS Genetics (GENETICS & HEREDITY)

Self-citation rate

2.20%

Articles published

438

About the journal: PLOS Genetics is run by an international Editorial Board, headed by the Editors-in-Chief, Greg Barsh (HudsonAlpha Institute of Biotechnology, and Stanford University School of Medicine) and Greg Copenhaver (The University of North Carolina at Chapel Hill). Articles published in PLOS Genetics are archived in PubMed Central and cited in PubMed.