On ARGs, pedigrees, and genetic relatedness matrices.

IF 5.1 | CAS Zone 3 (Biology) | JCR Q2 | GENETICS & HEREDITY
Genetics. Pub Date: 2025-10-08. DOI: 10.1093/genetics/iyaf219
Brieuc Lehmann, Hanbin Lee, Luke Anderson-Trocmé, Jerome Kelleher, Gregor Gorjanc, Peter L Ralph
Citations: 0

Abstract

On ARGs, pedigrees, and genetic relatedness matrices.

Genetic relatedness is a central concept in genetics, underpinning studies of population and quantitative genetics in human, animal, and plant settings. It is typically stored as a genetic relatedness matrix (GRM), whose elements are pairwise relatedness values between individuals. Relatedness has been defined in various contexts based on pedigree, genotype, phylogeny, coalescent times, and, recently, the ancestral recombination graph (ARG). For some downstream applications, including association studies, ARG-based GRMs have led to better performance than the genotype GRM. However, ARG-based GRMs present computational challenges because of their inherent quadratic time and space complexity. Here, we first discuss the different definitions of relatedness in a unifying context, making use of the additive model of a quantitative trait to provide a definition of "branch relatedness" and the corresponding "branch GRM". We explore the relationship between branch relatedness and pedigree relatedness (i.e., kinship) through a case study of French-Canadian individuals who have a known pedigree. Through the tree sequence encoding of an ARG, we then derive an efficient algorithm for computing products between the branch GRM and a general vector, without explicitly forming the branch GRM. This algorithm leverages the sparse encoding of genomes with the tree sequence and hence enables large-scale computations with the branch GRM. We demonstrate the power of this algorithm by developing a randomized principal components algorithm for tree sequences that easily scales to millions of genomes. All algorithms are implemented in the open source tskit Python package. Taken together, this work consolidates the different notions of relatedness as branch relatedness and, by leveraging the tree sequence encoding of an ARG, provides efficient algorithms that enable computations with the branch GRM to scale to mega-scale genomic datasets.
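The idea of multiplying the branch GRM by a vector without forming the matrix can be illustrated with a small pure-Python sketch. It is not the tskit implementation and it simplifies heavily: it assumes a single hypothetical toy tree rather than a tree sequence, an uncentred relatedness in which G[i][j] is the total length of branches ancestral to both samples i and j, and node IDs ordered so that every child has a smaller ID than its parent. All names (PARENT, BRANCH_LENGTH, grm_matvec, top_component) are invented for illustration. Plain power iteration stands in for the randomized principal components algorithm mentioned in the abstract, because it also needs only matrix-vector products.

```python
import random

# Hypothetical toy tree (an assumption, not data from the paper).
# child -> parent, and the length of the branch above each child.
#        6
#      /   \
#     4     5
#    / \   / \
#   0   1 2   3
PARENT = {0: 4, 1: 4, 2: 5, 3: 5, 4: 6, 5: 6}
BRANCH_LENGTH = {0: 1.0, 1: 1.0, 2: 2.0, 3: 2.0, 4: 3.0, 5: 2.0}
SAMPLES = [0, 1, 2, 3]
ROOT = 6


def path_to_root(node):
    """Return the nodes on the path from `node` up to (excluding) the root."""
    path = []
    while node != ROOT:
        path.append(node)
        node = PARENT[node]
    return path


def explicit_grm():
    """Quadratic reference: G[i][j] sums the branches shared by i and j."""
    paths = {s: set(path_to_root(s)) for s in SAMPLES}
    return [
        [sum(BRANCH_LENGTH[u] for u in paths[i] & paths[j]) for j in SAMPLES]
        for i in SAMPLES
    ]


def grm_matvec(w):
    """Compute G @ w in time linear in the tree size, without forming G."""
    # Upward pass: below[u] is the sum of w over the samples beneath node u.
    # Ascending IDs visit children before parents (assumed node ordering).
    below = {u: 0.0 for u in list(PARENT) + [ROOT]}
    for sample, weight in zip(SAMPLES, w):
        below[sample] = weight
    for u in sorted(PARENT):
        below[PARENT[u]] += below[u]
    # Downward pass: each branch contributes length * below[u] to every
    # sample beneath it, so accumulate contributions from the root down.
    acc = {ROOT: 0.0}
    for u in sorted(PARENT, reverse=True):
        acc[u] = acc[PARENT[u]] + BRANCH_LENGTH[u] * below[u]
    return [acc[s] for s in SAMPLES]


def top_component(matvec, n, iterations=500, seed=1):
    """Leading eigenpair by power iteration, using only matvec calls."""
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    for _ in range(iterations):
        v = matvec(v)
        norm = sum(x * x for x in v) ** 0.5
        v = [x / norm for x in v]
    gv = matvec(v)
    eigenvalue = sum(a * b for a, b in zip(v, gv))
    return eigenvalue, v
```

Calling `top_component(grm_matvec, len(SAMPLES))` returns an estimate of the leading eigenvalue and eigenvector while touching the tree only through `grm_matvec`. That is why a matrix-free product is enough to drive a principal components computation. A practical analysis would rely on the tskit package named in the abstract rather than on this sketch.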

Source journal: Genetics
Category: GENETICS & HEREDITY
CiteScore: 6.90
Self-citation rate: 6.10%
Articles published: 177
Review time: 1.5 months
Journal description: GENETICS is published by the Genetics Society of America, a scholarly society that seeks to deepen our understanding of the living world by advancing our understanding of genetics. Since 1916, GENETICS has published high-quality, original research presenting novel findings bearing on genetics and genomics. The journal publishes empirical studies of organisms ranging from microbes to humans, as well as theoretical work. While it has an illustrious history, GENETICS has changed along with the communities it serves: it is not your mentor's journal. The editors make decisions quickly, in around 30 days, without sacrificing the excellence and scholarship for which the journal has long been known. GENETICS is a peer-reviewed, peer-edited journal, with an international reach and increasing visibility and impact. All editorial decisions are made through collaboration of at least two editors who are practicing scientists. GENETICS is constantly innovating: expanded types of content include Reviews, Commentary (current issues of interest to geneticists), Perspectives (historical), Primers (to introduce primary literature into the classroom), Toolbox Reviews, plus YeastBook, FlyBook, and WormBook (coming spring 2016). For particularly time-sensitive results, we publish Communications. As part of our mission to serve our communities, we've published thematic collections, including Genomic Selection, Multiparental Populations, Mouse Collaborative Cross, and the Genetics of Sex.