Integrating Proteomics and GWAS to Identify Key Tissues and Genes Underlying Human Complex Diseases
Authors: Chao Xue, Miao Zhou
Journal: Biology-Basel, 14(5), published 2025-05-16 (impact factor 3.6, JCR Q1, category: Biology)
DOI: https://doi.org/10.3390/biology14050554
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12109507/pdf/
Citation count: 0
Abstract
Background: The tissues of origin and molecular mechanisms underlying human complex diseases remain incompletely understood. Previous studies have leveraged transcriptomic data to interpret genome-wide association studies (GWASs) for identifying disease-relevant tissues and fine-mapping causal genes. However, according to the central dogma, proteins more directly reflect cellular molecular activities than RNA. Therefore, in this study, we integrated proteomic data with GWAS to identify disease-associated tissues and genes.
Methods: We compiled proteomic and paired transcriptomic data for 12,229 genes across 32 human tissues from the GTEx project. Using three tissue inference approaches (S-LDSC, MAGMA, and DESE), we analyzed GWAS data for six representative complex diseases (bipolar disorder, schizophrenia, coronary artery disease, Crohn's disease, rheumatoid arthritis, and type 2 diabetes), with an average sample size of 260,000. We systematically compared the disease-associated tissues and genes identified using proteomic data with those identified using transcriptomic data.
Results: Tissue-specific protein abundance showed a moderate correlation with RNA expression (mean correlation coefficient = 0.46, 95% CI: 0.42-0.49). Proteomic data accurately identified disease-relevant tissues, such as the association between brain regions and schizophrenia and between coronary arteries and coronary artery disease. Compared to GWAS-based gene association estimates alone, incorporating proteomic data significantly improved gene association detection (AUC difference test, p = 0.0028). Furthermore, proteomic data revealed unique disease-associated genes that were not identified using transcriptomic data, such as the association between bipolar disorder and CREB1.
Conclusions: Integrating proteomic data enables accurate identification of disease-associated tissues and provides irreplaceable advantages in fine-mapping genes for complex diseases.
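As an illustration of the kind of summary statistic reported in the Results (a mean protein-RNA correlation with a 95% confidence interval), the following is a minimal sketch only. The abstract does not state which correlation measure, unit of analysis, or interval method the authors used, so the choices here are assumptions: Pearson correlation, computed per tissue across genes, and a normal-approximation interval on the mean. The function names, the data layout, and the data itself are hypothetical and synthetic; none of it comes from the study.

```python
import math
import random
from statistics import mean, stdev


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def mean_correlation_with_ci(protein, rna, z=1.96):
    """Mean per-tissue correlation with a normal-approximation 95% CI.

    protein, rna: dicts mapping tissue name -> list of per-gene values,
    with genes in the same order in both dicts (hypothetical layout).
    """
    rs = [pearson(protein[t], rna[t]) for t in protein]
    m = mean(rs)
    half_width = z * stdev(rs) / math.sqrt(len(rs))
    return m, (m - half_width, m + half_width)


# Synthetic data only: 5 made-up tissues, 200 made-up genes each.
random.seed(0)
tissues = [f"tissue_{i}" for i in range(5)]
rna = {t: [random.gauss(0, 1) for _ in range(200)] for t in tissues}
# Protein values are simulated as noisy functions of the RNA values.
protein = {t: [0.5 * v + random.gauss(0, 1) for v in rna[t]] for t in tissues}

m, (lo, hi) = mean_correlation_with_ci(protein, rna)
print(f"mean r = {m:.2f}, 95% CI: {lo:.2f} to {hi:.2f}")
```

With real data, a rank-based correlation or a bootstrap interval might be more appropriate; the sketch only shows how a mean correlation and its interval relate to the per-tissue values.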
About the journal:
Biology (ISSN 2079-7737) is an international, peer-reviewed, quick-refereeing open access journal of Biological Science published by MDPI online. It publishes reviews, research papers and communications in all areas of biology and at the interface of related disciplines. Our aim is to encourage scientists to publish their experimental and theoretical results in as much detail as possible. There is no restriction on the length of the papers. The full experimental details must be provided so that the results can be reproduced. Electronic files regarding the full details of the experimental procedure, if unable to be published in a normal way, can be deposited as supplementary material.