Clade Distillation for Genome-wide Association Studies

IF 5.1 · Zone 3 (Biology) · JCR Q2, GENETICS & HEREDITY
Genetics · Pub Date: 2025-08-07 · DOI: 10.1093/genetics/iyaf158
Ryan Christ, Xinxin Wang, Louis J M Aslett, David Steinsaltz, Ira Hall
Citation count: 0

Abstract

Clade Distillation for Genome-wide Association Studies.

Testing inferred haplotype genealogies for association with phenotypes has been a longstanding goal in human genetics given their potential to detect association signals driven by allelic heterogeneity - when multiple causal variants modulate a phenotype - in both coding and noncoding regions. Recent scalable methods for inferring locus-specific genealogical trees along the genome, or representations thereof, have made substantial progress towards this goal; however, the problem of testing these trees for association with phenotypes has remained unsolved due to the growth in the number of clades with increasing sample size. To address this issue, we introduce several practical improvements to the kalis ancestry inference engine, including a general optimal checkpointing algorithm for decoding hidden Markov models, thereby enabling efficient genome-wide analyses. We then propose LOCATER, a powerful new procedure based on the recently proposed Stable Distillation framework, to test local tree representations for trait association. Although LOCATER is demonstrated here in conjunction with kalis, it may be used for testing output from any ancestry inference engine, regardless of whether such engines return discrete tree structures, relatedness matrices, or some combination of the two at each locus. Using simulated quantitative phenotypes, our results indicate that LOCATER achieves substantial power gains over traditional single marker testing, ARG-Needle, and window-based testing in cases of allelic heterogeneity, while also improving causal region localization. These findings suggest that genealogy-based association testing will be a fruitful approach for gene discovery, especially for signals driven by multiple ultra-rare variants.

Source journal: Genetics (GENETICS & HEREDITY)
CiteScore: 6.90
Self-citation rate: 6.10%
Articles published: 177
Review time: 1.5 months
Journal description: GENETICS is published by the Genetics Society of America, a scholarly society that seeks to deepen our understanding of the living world by advancing our understanding of genetics. Since 1916, GENETICS has published high-quality, original research presenting novel findings bearing on genetics and genomics. The journal publishes empirical studies of organisms ranging from microbes to humans, as well as theoretical work. While it has an illustrious history, GENETICS has changed along with the communities it serves: it is not your mentor's journal. The editors make decisions quickly – in around 30 days – without sacrificing the excellence and scholarship for which the journal has long been known. GENETICS is a peer-reviewed, peer-edited journal, with an international reach and increasing visibility and impact. All editorial decisions are made through collaboration of at least two editors who are practicing scientists. GENETICS is constantly innovating: expanded types of content include Reviews, Commentary (current issues of interest to geneticists), Perspectives (historical), Primers (to introduce primary literature into the classroom), Toolbox Reviews, plus YeastBook, FlyBook, and WormBook (coming spring 2016). For particularly time-sensitive results, we publish Communications. As part of our mission to serve our communities, we've published thematic collections, including Genomic Selection, Multiparental Populations, Mouse Collaborative Cross, and the Genetics of Sex.