REMOVING PLEIOTROPIC SIGNALS REVEAL DISEASE-SPECIFIC GENETIC ARCHITECTURE IN NOISY, SHALLOW BIOBANK PHENOTYPES

IF 6.7 | Zone 2 (Medicine) | Q1 CLINICAL NEUROLOGY

European Neuropsychopharmacology Pub Date : 2025-10-01 DOI:10.1016/j.euroneuro.2025.08.547

Hyunkyung Kim, Na Cai, Andy Dahl

European Neuropsychopharmacology, Volume 99, Page 45. Journal Article. https://www.sciencedirect.com/science/article/pii/S0924977X25007059

Citations: 0

Abstract

Pleiotropy is pervasive in complex traits, and understanding it is necessary to distinguish shared from trait-specific genetic effects. Specific effects point to the core biology of a trait, which is especially challenging to characterize in heterogeneous traits such as major depressive disorder (MDD). Shared effects, on the other hand, can be exploited to improve statistical power for detecting genetic effects and for polygenic prediction. Large multi-trait genetic datasets, such as the UK Biobank, provide opportunities to jointly model these shared and specific effects across thousands of related traits.

However, the standard approach to understanding pleiotropy, genetic correlation, is overly simplistic because it captures only genome-wide aggregate similarity. While more recent approaches have extended genetic correlation to locus-level measures or to factor models spanning many traits, it remains challenging to separate trait-specific effects from those that are broadly shared across related phenotypes. For example, genetic effects on alcohol use and neuroticism will affect MDD, yet they are neither specific to MDD nor likely to shed light on its core etiology. Here, we develop a Bayesian matrix factorization approach that addresses these limitations by partitioning high-dimensional pleiotropic relationships into effects that are shared and effects that are specific to a focal trait of interest.
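To illustrate why a single genome-wide figure can hide locus-level structure, here is a minimal Python sketch on made-up effect sizes. It is not an estimator of genetic correlation from GWAS data and is not taken from the abstract; the array names, the number of loci, and the choice of two fully shared loci are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

n_loci, snps_per_locus = 10, 200

# Hypothetical per-SNP effect sizes for two traits, arranged as (locus, SNP).
beta_a = rng.normal(size=(n_loci, snps_per_locus))
beta_b = rng.normal(size=(n_loci, snps_per_locus))

# Assumption for illustration: the first two loci are fully shared between
# the traits, while the remaining eight are independent.
beta_b[:2] = beta_a[:2]


def corr(x, y):
    """Pearson correlation between two flattened effect-size arrays."""
    return float(np.corrcoef(np.ravel(x), np.ravel(y))[0, 1])


# One aggregate number across all SNPs ...
genome_wide = corr(beta_a, beta_b)

# ... versus one number per locus.
per_locus = np.array([corr(beta_a[k], beta_b[k]) for k in range(n_loci)])

print(f"genome-wide correlation: {genome_wide:.2f}")
print("per-locus correlations:", np.round(per_locus, 2))
```

In this toy setting the aggregate correlation is modest, yet it arises entirely from two loci with identical effects. The genome-wide summary alone cannot reveal that.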

First, we applied our approach to simulated data to demonstrate that it can reliably separate genetic effects that are specific to a trait from those that are mediated through secondary traits. Our approach outperforms other factorization-based approaches, such as conditioning on phenome-wide principal components (PCs). We then applied our approach to identify MDD-specific genetic effects in UK Biobank by accounting for shared genetic effects across 216 MDD-relevant traits. Specifically, we excluded the best-available measure, LifetimeMDD, and evaluated our ability to recapitulate this measure from two lower-quality measures, a GP-based measure and ICD10-based depression. We first showed that our approach yields more specific phenotypes, which are more strongly correlated with LifetimeMDD (R² values increase from 0.551 and 0.272 to 0.634 for the GP and ICD10 measures, respectively). Next, we showed that our approach yields better polygenic scores for predicting LifetimeMDD (R² values increase from 0.081 and 0.035 to 0.097 for the GP and ICD10 measures, respectively; both p_bootstrap < .01).
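The evaluation logic described above can be sketched on simulated data. The Python example below implements only the simpler baseline named in the abstract, conditioning a shallow measure on the top PCs of secondary traits, and not the authors' Bayesian matrix factorization, whose details the abstract does not give. The simulation design, all variable names, and the bootstrap (which resamples individuals without recomputing the PCs) are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n, n_secondary, k = 5000, 40, 3

# Hypothetical simulation: k shared factors drive the secondary traits and
# also leak into the shallow focal measure.
shared = rng.normal(size=(n, k))
loadings = rng.normal(size=(k, n_secondary))
secondary = shared @ loadings + rng.normal(size=(n, n_secondary))

# Stand-in for the trait-specific component (the "gold standard" target).
specific = rng.normal(size=n)
shallow = specific + shared @ np.ones(k) + 0.5 * rng.normal(size=n)


def r2(x, y):
    """Squared Pearson correlation between two vectors."""
    return float(np.corrcoef(x, y)[0, 1] ** 2)


def residualize_on_pcs(y, traits, n_pcs):
    """Remove from y its projection onto the top PCs of the trait matrix."""
    z = (traits - traits.mean(axis=0)) / traits.std(axis=0)
    u, _, _ = np.linalg.svd(z, full_matrices=False)
    pcs = u[:, :n_pcs]  # orthonormal PC scores, one column per PC
    y_centered = y - y.mean()
    return y_centered - pcs @ (pcs.T @ y_centered)


adjusted = residualize_on_pcs(shallow, secondary, n_pcs=k)
r2_before = r2(shallow, specific)
r2_after = r2(adjusted, specific)

# Simple bootstrap over individuals for the gain in R^2.
n_boot = 200
gains = np.empty(n_boot)
for b in range(n_boot):
    idx = rng.integers(0, n, size=n)
    gains[b] = r2(adjusted[idx], specific[idx]) - r2(shallow[idx], specific[idx])
p_boot = float(np.mean(gains <= 0))

print(f"R^2 before: {r2_before:.3f}, after: {r2_after:.3f}, p_boot: {p_boot:.3f}")
```

In this toy setting, removing the shared component brings the shallow measure closer to the specific target. With real data the shared factors are unknown and the number of PCs must be chosen, which is part of what a model-based factorization aims to handle.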

Overall, our approach can be applied to any large-scale, noisy biobank phenotype to improve its disorder specificity. This is an important step toward bridging the gap between carefully phenotyped and shallowly phenotyped datasets, which is essential for deriving powerful and specific genetic associations in complex traits.


Source journal: European Neuropsychopharmacology (Medicine: Psychiatry)
CiteScore: 10.30
Self-citation rate: 5.40%
Articles published: 730
Review time: 41 days

About the journal: European Neuropsychopharmacology is the official publication of the European College of Neuropsychopharmacology (ECNP). In accordance with the mission of the College, the journal focuses on clinical and basic science contributions that advance our understanding of brain function and human behaviour and enable translation into improved treatments and enhanced public health impact in psychiatry. Recent years have been characterized by exciting advances in basic knowledge and available experimental techniques in neuroscience and genomics. However, clinical translation of these findings has not been as rapid. The journal aims to narrow this gap by promoting findings that are expected to have a major impact on both our understanding of the biological bases of mental disorders and the development and improvement of treatments, ideally paving the way for prevention and recovery.