Arnab Kumar Khan, Tanushree Haldar, Arunabha Majumdar
bioRxiv - Genomics, published 2024-09-17. DOI: 10.1101/2024.09.12.612639 (https://doi.org/10.1101/2024.09.12.612639)
A unified Bayesian approach to transcriptome-wide association study
Transcriptome-wide association studies (TWAS) have shed light on molecular mechanisms by examining the roles of genes in complex disease etiology. A TWAS uses a reference panel of transcriptomic data to build a prediction model for gene expression and to identify the expression quantitative trait loci (eQTLs) that affect it. These eQTLs are then used to construct the genetically regulated gene expression (GReX) in the GWAS data, and a test between the imputed GReX and the trait indicates gene-trait association. Such a two-step approach ignores the uncertainty of the predicted expression and can reduce inference accuracy, e.g., by inflating the type-I error rate. To avoid the two-step approach, we develop a unified Bayesian method for TWAS that models the two datasets simultaneously. We use the horseshoe prior in the transcriptome data to model the relationship between gene expression and local SNPs, and the spike-and-slab prior to test for an association between the GReX and the trait. We extend our approach to multi-ancestry TWAS, focusing on discovering genes that affect the trait in all ancestries. Simulations show that our method estimates the GReX effect size more accurately than other methods. Applied to real data from the GEUVADIS expression study and GWAS data from the UK Biobank, our method identified several novel genes associated with body mass index (BMI).
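For orientation, the conventional two-step procedure that the abstract contrasts itself with can be sketched roughly as follows. This is a minimal illustration on simulated data, not the authors' unified Bayesian method: the ridge penalty, the function name `two_step_twas`, the sample sizes, and the effect sizes are all assumptions chosen for the example. Note that step 2 treats the imputed GReX as if it were observed without error, which is the source of the uncertainty problem the abstract describes.

```python
import numpy as np

def two_step_twas(X_ref, expr_ref, X_gwas, trait, ridge_lambda=1.0):
    """Hypothetical sketch of a conventional two-step TWAS for one gene."""
    # Step 1: fit a ridge prediction model of expression on local SNPs
    # using the reference panel (ridge is an assumed stand-in model).
    p = X_ref.shape[1]
    Xc = X_ref - X_ref.mean(axis=0)
    yc = expr_ref - expr_ref.mean()
    weights = np.linalg.solve(Xc.T @ Xc + ridge_lambda * np.eye(p), Xc.T @ yc)

    # Step 2: impute GReX in the GWAS sample and regress the trait on it.
    # The imputed GReX is treated as fixed, so its uncertainty is ignored.
    grex = (X_gwas - X_gwas.mean(axis=0)) @ weights
    g = grex - grex.mean()
    t = trait - trait.mean()
    beta = (g @ t) / (g @ g)
    resid = t - beta * g
    se = np.sqrt((resid @ resid) / (len(t) - 2) / (g @ g))
    return weights, beta, se, beta / se

# Simulated data (all values below are illustrative assumptions).
rng = np.random.default_rng(0)
n_ref, n_gwas, p = 300, 5000, 20
maf = rng.uniform(0.1, 0.5, size=p)                      # minor allele frequencies
X_ref = rng.binomial(2, maf, size=(n_ref, p)).astype(float)
X_gwas = rng.binomial(2, maf, size=(n_gwas, p)).astype(float)
w_true = np.zeros(p)
w_true[:3] = [0.5, -0.4, 0.3]                            # three true eQTLs
expr_ref = X_ref @ w_true + rng.normal(size=n_ref)
trait = 0.2 * (X_gwas @ w_true) + rng.normal(size=n_gwas)

weights, beta, se, z = two_step_twas(X_ref, expr_ref, X_gwas, trait)
print(f"GReX effect estimate: {beta:.3f}, SE: {se:.3f}, z: {z:.2f}")
```

A unified approach of the kind the abstract describes would instead estimate the SNP weights and the GReX effect jointly, so that uncertainty in the weights propagates into the inference on the effect.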