CADET: Enhanced transcriptome-wide association analyses in admixed samples using eQTL summary data

IF 8.1 · CAS Zone 1 (Biology) · JCR Q1 (Genetics & Heredity)
S. Taylor Head, Qile Dai, Joellen Schildkraut, David J. Cutler, Jingjing Yang, Michael P. Epstein
Journal: American Journal of Human Genetics. Published: 2025-06-13. Publication type: Journal Article. DOI: 10.1016/j.ajhg.2025.05.010 (https://doi.org/10.1016/j.ajhg.2025.05.010)
Citations: 0

Abstract
CADET: Enhanced transcriptome-wide association analyses in admixed samples using eQTL summary data
A transcriptome-wide association study (TWAS) is a popular statistical method for identifying genes whose genetically regulated expression (GReX) component is associated with a trait of interest. Most TWAS approaches fundamentally assume that the training dataset (used to fit the gene expression prediction model) and target genome-wide association study (GWAS) dataset are from the same ancestrally homogeneous population. If this assumption is violated, studies have shown a marked negative impact on expression prediction accuracy as well as reduced power of the downstream gene-trait association test. These issues pose a particular problem for admixed individuals whose genomes represent a mosaic of multiple continental ancestral segments. To resolve these issues, we present CADET, which enables powerful TWAS of admixed cohorts leveraging the local-ancestry (LA) information of the cohort along with summary-level expression quantitative trait locus (eQTL) data from reference panels of different ancestral groups. CADET combines multiple polygenic risk score models based on the summary-level eQTL reference data to predict LA-aware GReX components in admixed target samples. Using simulated data, we compare the imputation accuracy, power, and type I error rate of our proposed LA-aware approach to LA-unaware methods for performing TWASs. We show that CADET performs optimally in nearly all settings regardless of whether the genetic architecture of gene expression is dependent or independent of ancestry. We further illustrate CADET by performing a TWAS of 29 common blood biochemistry phenotypes within an admixed cohort from the UK Biobank and identify 18 hits unique to our LA-aware strategy, with the majority of hits supported by existing GWAS findings.
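The idea of predicting an LA-aware GReX component can be illustrated with a minimal sketch. It assumes phased haplotypes, per-allele local-ancestry calls, and one eQTL weight vector per ancestral reference panel are already available. The function name, array layout, and toy weights are hypothetical and are not taken from the CADET software; the abstract does not say how CADET fits or combines its polygenic score models, so this only shows the general principle of scoring each allele with the weights that match its local ancestry.

```python
import numpy as np


def predict_la_aware_grex(haplotypes, local_ancestry, weights):
    """Hypothetical LA-aware expression predictor (illustration only).

    haplotypes:     (n_samples, 2, n_snps) array of 0/1 effect-allele indicators
    local_ancestry: (n_samples, 2, n_snps) integer ancestry labels in 0..K-1
    weights:        (K, n_snps) ancestry-specific per-allele eQTL weights
    Returns an (n_samples,) array of predicted GReX values.
    """
    haplotypes = np.asarray(haplotypes, dtype=float)
    local_ancestry = np.asarray(local_ancestry, dtype=int)
    weights = np.asarray(weights, dtype=float)

    snp_index = np.arange(weights.shape[1])
    # For every allele, look up the weight from the panel matching its ancestry:
    # per_allele_weight[i, h, j] == weights[local_ancestry[i, h, j], j]
    per_allele_weight = weights[local_ancestry, snp_index]

    # Sum weighted alleles over both haplotypes and all SNPs.
    return (haplotypes * per_allele_weight).sum(axis=(1, 2))


# Toy example: 2 individuals, 3 SNPs, 2 ancestral panels (made-up numbers).
toy_weights = np.array([[0.5, 0.0, -0.2],   # panel 0
                        [0.1, 0.3, 0.4]])   # panel 1
toy_haps = np.array([[[1, 0, 1], [1, 1, 0]],
                     [[0, 1, 1], [0, 0, 0]]])
toy_la = np.array([[[0, 0, 0], [1, 1, 1]],
                   [[0, 0, 1], [0, 0, 0]]])

grex = predict_la_aware_grex(toy_haps, toy_la, toy_weights)
print(grex)
```

An LA-unaware predictor would instead apply a single weight vector to every allele regardless of the segment's ancestry, which is the kind of approach the abstract compares against.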
Source journal
CiteScore: 14.70
Self-citation rate: 4.10%
Articles published: 185
Review time: 1 month
Journal description: The American Journal of Human Genetics (AJHG) is a monthly journal published by Cell Press, chosen by The American Society of Human Genetics (ASHG) as its premier publication starting from January 2008. AJHG represents Cell Press's first society-owned journal, and both ASHG and Cell Press anticipate significant synergies between AJHG content and that of other Cell Press titles.