Integrating Whole Genome and Transcriptome Sequencing to Characterize the Genetic Architecture of Isoform Variation and its Implications for Health and Disease.
Chunyu Liu, Roby Joehanes, Jiantao Ma, Jiuyong Xie, Jian Yang, Mengyao Wang, Tianxiao Huan, Shih-Jen Hwang, Jia Wen, Quan Sun, Demirkale Y Cumhur, Nancy L Heard-Costa, Peter Orchard, April P Carson, Laura M Raffield, Alexander Reiner, Yun Li, George O'Connor, Joanne M Murabito, Peter Munson, Daniel Levy
medRxiv : the preprint server for health sciences. Published 2024-12-06. DOI: 10.1101/2024.12.04.24318434 (https://doi.org/10.1101/2024.12.04.24318434)
Citations: 0
Abstract
We created a comprehensive whole-blood splice variation quantitative trait locus (sQTL) resource by analyzing isoform expression ratios (isoform-to-gene) in Framingham Heart Study (FHS) participants (discovery: n=2,622; validation: n=1,094) with whole genome sequencing (WGS) and transcriptome sequencing (RNA-seq) data. External replication was conducted using WGS and RNA-seq data from the Jackson Heart Study (JHS, n=1,020). We identified over 3.5 million cis-sQTL-isoform pairs (p < 5e-8), comprising 1,176,624 cis-sQTL variants and 10,883 isoform transcripts from 4,971 sGenes, with a significant change in isoform-to-gene ratio due to allelic variation. We validated 61% of these pairs in the FHS validation sample (p < 1e-4). External validation (p < 1e-4) in JHS for the top 10,000 and 100,000 most significant cis-sQTL-isoform pairs was 88% and 69%, respectively, while 23% of all pairs validated. For 20% of cis-sQTLs in the FHS discovery sample, allelic variation did not significantly correlate with overall gene expression. sQTLs are enriched in splice donor and acceptor sites, as well as in GWAS SNPs, methylation QTLs, and protein QTLs. We detailed several sentinel cis-sQTLs influencing alternative splicing, with potential causal effects on cardiovascular disease risk. Notably, rs12898397 (T>C) affects splicing of ULK3, lowering levels of the full-length transcript ENST00000440863.7 and increasing levels of the truncated transcript ENST00000569437.5; the two transcripts encode proteins of different lengths. Mendelian randomization analysis demonstrated that a lower ratio of the full-length isoform is causally associated with lower diastolic blood pressure and reduced lymphocyte percentages. This sQTL resource provides valuable insights into how transcriptomic variation may influence health outcomes.
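To make the isoform-to-gene ratio concrete, the following is a minimal sketch of how such a ratio could be computed and tested against allele dosage for a single variant-isoform pair. It is not the authors' pipeline: the abstract does not describe the model, so the plain linear regression without covariates, the function names `isoform_ratio` and `ols_slope_test`, and the simulated data (including the effect size of 0.1 per allele) are all assumptions made for illustration.

```python
import math
import random


def isoform_ratio(isoform_expr, gene_expr):
    """Isoform-to-gene expression ratio; None where the gene is not expressed."""
    return [i / g if g > 0 else None for i, g in zip(isoform_expr, gene_expr)]


def ols_slope_test(dosage, ratio):
    """Simple linear regression of ratio on allele dosage (0, 1, 2).

    Returns (slope, standard error, t statistic). No covariates are modeled,
    which is a simplification assumed for this sketch.
    """
    pairs = [(x, y) for x, y in zip(dosage, ratio) if y is not None]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((y - intercept - slope * x) ** 2 for x, y in pairs)
    se = math.sqrt(rss / (n - 2) / sxx)
    return slope, se, slope / se


# Simulated toy data (hypothetical, not from FHS or JHS).
random.seed(0)
dosage = [random.choice([0, 1, 2]) for _ in range(500)]
gene_expr = [random.uniform(50, 150) for _ in dosage]
# Assumed effect: each alternate allele lowers the isoform share by 0.1.
isoform_expr = [
    g * min(max(0.6 - 0.1 * d + random.gauss(0, 0.05), 0.0), 1.0)
    for g, d in zip(gene_expr, dosage)
]

ratio = isoform_ratio(isoform_expr, gene_expr)
slope, se, t_stat = ols_slope_test(dosage, ratio)
print(f"slope={slope:.4f} se={se:.4f} t={t_stat:.1f}")
```

In this toy setup the allele changes the isoform's share of the gene's output while total gene expression is drawn independently of genotype, which mirrors the abstract's observation that some cis-sQTLs do not correlate with overall gene expression. A real analysis would adjust for covariates and apply the stated significance thresholds across all tested pairs.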