Genetic analyses of eight complex diseases using predicted continuous representations of disease
Robert Chen, Ghislain Rocheleau, Ben Omega Petrazzini, Iain S Forrest, Joshua K Park, Áine Duffy, Ha My T Vy, Daniel Jordan, Ron Do
Cell Reports Methods, article 101115. Published 2025-08-18 (Epub 2025-07-25). Journal Article.
DOI: https://doi.org/10.1016/j.crmeth.2025.101115
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12461582/pdf/
Citations: 0
Abstract
We evaluated whether predicted continuous disease representations could enhance genetic discovery beyond case-control genome-wide association study (GWAS) phenotypes across eight complex diseases in up to 485,448 UK Biobank participants. Predicted phenotypes had high genetic correlations with case-control phenotypes (median r_g = 0.66) but identified more independent associations (median 306 versus 125). While some predicted phenotype associations were spurious, multi-trait analysis of GWAS-boosted case-control phenotypes identified a median of 46 additional variants per disease, of which a median of 73% replicated in FinnGen, 37% reached genome-wide significance in a UK Biobank/FinnGen meta-analysis, and 45% had supporting evidence. Predicted phenotypes also identified 14 genes targeted by phase I-IV drugs not identified by case-control phenotypes, and combined polygenic risk scores (PRSs) using both phenotypes improved prediction performance, with a median 37% increase in Nagelkerke's R². Predicted phenotypes represent composite biomarkers complementing case-control approaches in genetic discovery, drug target prioritization, and risk prediction, though efficacy varies across diseases.
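For readers unfamiliar with the risk-prediction metric, the following is a minimal sketch of how Nagelkerke's R² and a relative increase between two models can be computed. It is not the authors' pipeline. The function names, the toy labels, and the toy probabilities are all hypothetical, and reading the reported 37% as a relative (rather than absolute) change is an assumption. The sketch also scores arbitrary predicted probabilities against an intercept-only null, whereas the metric is normally reported for a fitted logistic regression.

```python
import math


def bernoulli_log_likelihood(labels, probs):
    """Log-likelihood of 0/1 labels under predicted case probabilities."""
    eps = 1e-12  # clamp to avoid log(0)
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total


def nagelkerke_r2(labels, probs):
    """Nagelkerke's R^2: Cox-Snell R^2 divided by its maximum attainable value.

    The null model predicts the sample prevalence for every individual.
    """
    n = len(labels)
    cases = sum(labels)
    if cases == 0 or cases == n:
        raise ValueError("need both cases and controls")
    prevalence = cases / n
    ll_null = bernoulli_log_likelihood(labels, [prevalence] * n)
    ll_model = bernoulli_log_likelihood(labels, probs)
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell / max_cox_snell


def relative_increase(r2_baseline, r2_combined):
    """Relative change in R^2, e.g. 0.37 for a 37% increase (assumed reading)."""
    return (r2_combined - r2_baseline) / r2_baseline


if __name__ == "__main__":
    # Hypothetical toy data: 1 = case, 0 = control.
    labels = [0, 0, 0, 1, 1, 1, 0, 1]
    # Hypothetical predicted probabilities from a case-control PRS model
    # and from a combined (case-control + predicted phenotype) PRS model.
    probs_case_control = [0.4, 0.3, 0.5, 0.6, 0.5, 0.7, 0.4, 0.6]
    probs_combined = [0.3, 0.2, 0.4, 0.7, 0.6, 0.8, 0.3, 0.7]

    r2_base = nagelkerke_r2(labels, probs_case_control)
    r2_comb = nagelkerke_r2(labels, probs_combined)
    print(f"case-control R2: {r2_base:.3f}")
    print(f"combined R2:     {r2_comb:.3f}")
    print(f"relative change: {relative_increase(r2_base, r2_comb):.1%}")
```

Dividing by the maximum attainable Cox-Snell value rescales the statistic so that a perfectly discriminating model approaches 1, which makes values more comparable across diseases with different case fractions.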