Estimating evolutionary and demographic parameters via ARG-derived IBD.

IF 4.0; Zone 2 (Biology); JCR Q1, GENETICS & HEREDITY
PLoS Genetics Pub Date : 2025-01-08 eCollection Date: 2025-01-01 DOI:10.1371/journal.pgen.1011537
Zhendong Huang, Jerome Kelleher, Yao-Ban Chan, David Balding
PLoS Genetics 21(1): e1011537. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11750106/pdf/

Abstract

Inference of evolutionary and demographic parameters from a sample of genome sequences often proceeds by first inferring identical-by-descent (IBD) genome segments. By exploiting efficient data encoding based on the ancestral recombination graph (ARG), we obtain three major advantages over current approaches: (i) no need to impose a length threshold on IBD segments, (ii) IBD can be defined without the hard-to-verify requirement of no recombination, and (iii) computation time can be reduced with little loss of statistical efficiency using only the IBD segments from a set of sequence pairs that scales linearly with sample size. We first demonstrate powerful inferences when true IBD information is available from simulated data. For IBD inferred from real data, we propose an approximate Bayesian computation inference algorithm and use it to show that even poorly inferred short IBD segments can improve estimation. Our mutation-rate estimator achieves precision similar to a previously published method despite a 4,000-fold reduction in data used for inference, and we identify significant differences between human populations. Computational cost limits model complexity in our approach, but we are able to incorporate unknown nuisance parameters and model misspecification, still finding improved parameter inference.
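
The abstract names approximate Bayesian computation (ABC) but does not describe the algorithm itself. As a rough illustration of the general idea only, the sketch below runs a generic ABC rejection sampler on a toy problem: recovering the rate of exponentially distributed segment lengths from their mean. The exponential model, the uniform prior, the tolerance, and every function and variable name here are assumptions made for illustration; none of them is taken from the paper, and this is not the authors' inference algorithm.

```python
import random
import statistics


def simulate_lengths(rate, n, rng):
    """Toy simulator (assumed model): n segment lengths from Exponential(rate)."""
    return [rng.expovariate(rate) for _ in range(n)]


def abc_rejection(observed, prior_low, prior_high, n_draws, tolerance, rng):
    """Generic ABC rejection sampling.

    Draw a rate from a uniform prior, simulate data of the same size as the
    observed data, and keep the draw if the simulated summary statistic
    (the mean length) is within `tolerance` of the observed one.
    """
    obs_summary = statistics.fmean(observed)
    accepted = []
    for _ in range(n_draws):
        rate = rng.uniform(prior_low, prior_high)
        simulated = simulate_lengths(rate, len(observed), rng)
        if abs(statistics.fmean(simulated) - obs_summary) <= tolerance:
            accepted.append(rate)
    return accepted


rng = random.Random(1)  # fixed seed so the run is repeatable
true_rate = 2.0         # hypothetical "true" parameter for the toy data
observed = simulate_lengths(true_rate, 200, rng)

# Accepted draws approximate the posterior distribution of the rate.
posterior = abc_rejection(observed, 0.5, 5.0, 3000, 0.02, rng)
estimate = statistics.fmean(posterior)
print(f"accepted {len(posterior)} draws, posterior mean rate = {estimate:.2f}")
```

The accepted draws form an approximate posterior sample, so their mean serves as a point estimate and their spread as a measure of uncertainty. A smaller tolerance gives a closer approximation at the cost of fewer accepted draws, which is one reason simulation cost can constrain model complexity in ABC-style inference.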
