Jane W. Liang, Gregory E. Idos, Christine Hong, Kristen M. Shannon, Lauren M. Bear, Jennifer Morales Pichardo, Zoe Guan, Anne Marie McCarthy, James M. Ford, Allison W. Kurian, Stephen B. Gruber, Danielle Braun, Giovanni Parmigiani
{"title":"Evaluating a Mendelian Risk Prediction Model That Aggregates Across Genes and Cancers","authors":"Jane W. Liang, Gregory E. Idos, Christine Hong, Kristen M. Shannon, Lauren M. Bear, Jennifer Morales Pichardo, Zoe Guan, Anne Marie McCarthy, James M. Ford, Allison W. Kurian, Stephen B. Gruber, Danielle Braun, Giovanni Parmigiani","doi":"10.1002/gepi.70038","DOIUrl":"10.1002/gepi.70038","url":null,"abstract":"<div><p>Using principles of Mendelian genetics, probability theory, and mutation-specific knowledge, Mendelian risk prediction models identify those at high risk of carrying a heritable cancer susceptibility variant and assess future risk of cancer. Our previously-validated Fam3PRO model is a generalizable and computationally efficient Mendelian risk prediction framework that incorporates an arbitrary number of gene-cancer associations. In practice, from a model training perspective, there may be uncertainty in estimating the population-level model parameters necessary for rare gene-cancer associations. From a clinical perspective, it may be infeasible to obtain a detailed patient family history for many cancers. Motivated by the context of pre-screening for germline testing of a broad hereditary cancer gene panel, we propose a Mendelian model that aggregates information across genes and cancers, reducing patient burden and bypassing the need for robust parameter estimation for rare genes and syndromes. We evaluated this aggregate model through simulations and applied it to two independent clinical cohorts. 
We show that when the clinical goal is to assess patient risk of carrying a pathogenic variant for any cancer susceptibility gene, the aggregate model can give results comparable to a Mendelian model that considers many genes and cancers individually, while greatly simplifying model assumptions and user input.</p></div>","PeriodicalId":12710,"journal":{"name":"Genetic Epidemiology","volume":"50 3","pages":""},"PeriodicalIF":3.8,"publicationDate":"2026-03-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147372927","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Rafael A. Nafikov, Harkirat K. Sohi, Alejandro Q. Nato Jr., Andrea R. Horimoto, Tyler R. C. Day, Thomas D. Bird, Anita L. DeStefano, Elizabeth E. Blue, Ellen M. Wijsman
{"title":"Variant Prioritization by Pedigree-Based Haplotyping","authors":"Rafael A. Nafikov, Harkirat K. Sohi, Alejandro Q. Nato Jr., Andrea R. Horimoto, Tyler R. C. Day, Thomas D. Bird, Anita L. DeStefano, Elizabeth E. Blue, Ellen M. Wijsman","doi":"10.1002/gepi.70039","DOIUrl":"10.1002/gepi.70039","url":null,"abstract":"<p>Whole genome sequence (WGS) data provides opportunities for comprehensive evaluation of variants that may influence complex traits. However, prioritizing the large number of variants, particularly those in non-coding regions, is a challenge. Here we present an approach that uses pedigree-based haplotyping to identify the risk haplotype and resulting set of prioritized variants in a region of interest (ROI) defined by identity-by-descent (IBD) sharing among familial cases. The approach is applicable both to a full range of pedigree sizes and to the full allele frequency spectrum of variants, without the need for a large reference sample. By determining haplotype sharing among individuals with WGS data, we demonstrate the ability to accurately identify a risk haplotype and a strongly reduced list of potential risk alleles for a trait of interest, along with the cases who carry the risk haplotype. This is important in the context of complex traits where the disease may be etiologically heterogeneous even within a single pedigree. Application to both simulated and real Alzheimer's disease family data shows that the approach leads to accurate risk-haplotype identification with marked reduction in the number of potential trait-associated variants. 
Simulation also shows that the approach provides accurate risk haplotypes in ROIs.</p>","PeriodicalId":12710,"journal":{"name":"Genetic Epidemiology","volume":"50 3","pages":""},"PeriodicalIF":3.8,"publicationDate":"2026-03-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12967030/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147372884","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Karina Patasova, Bahar Sedaghati-Khayat, Rachel Knevel, Heather J. Cordell, Arthur G. Pratt
{"title":"Methods for Prioritizing Causal Genes in Molecular Studies of Human Disease: The State of the Art","authors":"Karina Patasova, Bahar Sedaghati-Khayat, Rachel Knevel, Heather J. Cordell, Arthur G. Pratt","doi":"10.1002/gepi.70037","DOIUrl":"10.1002/gepi.70037","url":null,"abstract":"<p>In the last decade, genome-wide association studies (GWAS) have identified tens of thousands of common variants associated with a wide array of complex traits and diseases. Integration of GWAS with molecular data has informed the development of statistical tools for causal gene discovery. In this paper, we give an overview of commonly used causal inference methods and discuss the strengths and limitations of colocalization, Mendelian randomization (MR) and network-based approaches. Colocalization is often used to assess whether the genetic association signals for two traits arise from the same causal variant, thereby strengthening inferred causal associations. MR was developed to tackle issues of confounding and reverse causality, providing a rigorous approach to causal inference and demonstrating improved false discovery rates. Unlike MR, network-based analyses employ a discovery approach and model complex relationships between multiple variables. All causal inference methods are, to varying degrees, susceptible to spurious associations due to genetic confounding, pleiotropy and linkage disequilibrium. Here, we discuss the latest developments in the field of causal gene inference and limitations of these methods. 
We give an overview of the interplay between different approaches as well as practical applications, with reference to published examples in the context of heart disease.</p>","PeriodicalId":12710,"journal":{"name":"Genetic Epidemiology","volume":"50 3","pages":""},"PeriodicalIF":3.8,"publicationDate":"2026-03-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12952701/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147343852","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Correction to “Multivariable MR Can Mitigate Bias in Two-Sample MR Using Covariable-Adjusted Summary Associations”","authors":"","doi":"10.1002/gepi.70035","DOIUrl":"10.1002/gepi.70035","url":null,"abstract":"<p>Gilbody, J., Borges, M.C., Davey Smith, G. and Sanderson, E. (2025), Multivariable MR Can Mitigate Bias in Two-Sample MR Using Covariable-Adjusted Summary Associations. Genetic Epidemiology, 49: e22606. https://doi.org/10.1002/gepi.22606</p><p>In the originally published article, the funding code from the British Heart Foundation was given incorrectly. The correct code is AA/18/1/34219. The online version of this article has been corrected.</p><p>We apologize for this error.</p>","PeriodicalId":12710,"journal":{"name":"Genetic Epidemiology","volume":"50 2","pages":""},"PeriodicalIF":3.8,"publicationDate":"2026-02-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/gepi.70035","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147270787","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Integrative Harmonization of Phenotypic and Genomic Data Improves Bone Mineral Density Prediction in Multi-Study Osteoporosis Research","authors":"Anqi Liu, Jianing Liu, Lang Wu, Qing Wu","doi":"10.1002/gepi.70028","DOIUrl":"10.1002/gepi.70028","url":null,"abstract":"<p>Harmonizing osteoporosis-related data across multiple data sets is essential for improving the accuracy and generalizability of bone mineral density (BMD) assessments. This study developed a harmonization framework to standardize phenotypic and genomic variables across three major US osteoporosis data sets: GDBF, GWAS, and NHANES. We standardized key phenotypic variables (BMD, body mass index (BMI), age, sex, and race/ethnicity) using cohort-specific data dictionaries and applied multiple imputations by chained equations (MICEs) to manage missing data. Genomic data were harmonized using principal component analysis (PCA)-based batch effect corrections. Residual regression methods were applied to standardize BMD values. The effectiveness of harmonization on BMD prediction was evaluated using generalized estimating equations (GEEs) and mixed-effects models. Post-harmonization, inter-study variability in BMI was significantly reduced (<i>Ω</i><sup>2</sup> = 0.0028), and BMD associations with covariates remained consistent across data sets. Harmonized models showed improved predictive performance, with explained variance in BMD increasing (<i>R</i><sup>2</sup> = 0.14%). PCA confirmed the effective alignment of genetic data, reducing batch effects and improving cross-study compatibility. This study demonstrates the feasibility and effectiveness of harmonizing phenotypic and genomic data for osteoporosis research. The harmonization framework enhances BMD prediction accuracy, supports more inclusive osteoporosis risk assessment, and improves the integration of multi-cohort data sets for future research. 
These findings highlight the potential of data harmonization in advancing precision medicine for osteoporosis prevention and management.</p>","PeriodicalId":12710,"journal":{"name":"Genetic Epidemiology","volume":"50 1","pages":""},"PeriodicalIF":3.8,"publicationDate":"2026-01-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12836449/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146051694","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Benjamin Woolf, Amy Mason, Chin Yang Shapland, Hyunseung Kang, Hannah M. Sallis, Stephen Burgess, Marcus R. Munafò
{"title":"Extending the Use of Mendelian Randomisation With Non-Inherited Variants to Assess Socially Transmitted Parental Exposures Under Assortative Mating","authors":"Benjamin Woolf, Amy Mason, Chin Yang Shapland, Hyunseung Kang, Hannah M. Sallis, Stephen Burgess, Marcus R. Munafò","doi":"10.1002/gepi.70031","DOIUrl":"10.1002/gepi.70031","url":null,"abstract":"<p>A longstanding aim of developmental psychology and epidemiology is to understand the causal effects of parental phenotypes on offspring outcomes. Traditional approaches often fail to account for confounding and reverse causation. We evaluate the use of Mendelian randomisation with non-inherited variants (MR-NIV) to address these limitations. MR-NIV leverages non-inherited genetic variants to instrument the parental phenotype independent of the offspring's genotype. We used Directed Acyclic Graphs and simulations to validate MR-NIV and explore robustness to assortative mating. In contrast to an alternative MR method which adjusts the parental genotype for offspring genotype, MR-NIV can be robust to assortative mating when used without trio data. In settings without trio data, MR-NIV outperformed the adjustment method. The adjustment method outperformed MR-NIV in settings with trio data. Applying MR-NIV to the Avon Longitudinal Study of Parents and Children, we assessed the causal effect of parental smoking on offspring smoking initiation at age 16. Results were consistent with observational studies, suggesting a meaningful increase in the risk of offspring smoking due to parental smoking. However, larger sample sizes will be necessary to provide a precise answer. 
MR-NIV offers a promising extension of Mendelian randomisation for studying the developmental environment.</p>","PeriodicalId":12710,"journal":{"name":"Genetic Epidemiology","volume":"50 1","pages":""},"PeriodicalIF":3.8,"publicationDate":"2026-01-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12820921/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146010103","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Md. Moksedul Momin, Xuan Zhou, Muktar Ahmed, Elina Hyppönen, Beben Benyamin, S. Hong Lee
{"title":"Cross-Ancestry Polygenic Prediction: Comparing Methods and Assessing Transferability Across Traits","authors":"Md. Moksedul Momin, Xuan Zhou, Muktar Ahmed, Elina Hyppönen, Beben Benyamin, S. Hong Lee","doi":"10.1002/gepi.70029","DOIUrl":"10.1002/gepi.70029","url":null,"abstract":"<p>Accurate prediction of disease risk and other complex traits across different populations is essential for clinical and research purposes. However, genetic differences among ancestries, such as allelic frequencies and genetic architecture, can affect the performance of polygenic risk score (PGS) methods in cross-ancestry prediction. To address this issue, we conducted a formal test of seven polygenic prediction methods applicable across ancestries for five traits (BMI, standing height, LDL-, HDL- and total-cholesterol) from the UK Biobank dataset. We demonstrate that GBLUP and PRS-CSx outperformed other methods for highly polygenic traits like height and BMI. In contrast, PRSice and PolyPred performed best for less polygenic traits like cholesterol, with PRS-CSx being comparable with larger sample sizes. We also observed that utilizing concordant SNPs, which have the same effect direction across diverse ancestries, can improve the accuracy of cross-ancestry PGS models. Furthermore, we found that the transferability of PGS across ancestries varied depending on the trait. 
Understanding the strengths and limitations of different methods and approaches is important for future methodological development and improvement, enabling better interpretation and application of PGS results in clinical and research settings.</p>","PeriodicalId":12710,"journal":{"name":"Genetic Epidemiology","volume":"50 1","pages":""},"PeriodicalIF":3.8,"publicationDate":"2026-01-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12820924/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146010071","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MELODY: Mediation Analysis in Logistic Regression for High-Dimensional Mediators and a Binary Outcome","authors":"Sunyi Chi, Xingyu Li, Peng Wei, Xuelin Huang","doi":"10.1002/gepi.70033","DOIUrl":"10.1002/gepi.70033","url":null,"abstract":"<p>Mediation analysis is a pivotal tool for elucidating the indirect effect of an environmental factor or treatment on disease through potentially high-dimensional omics data, such as gene expression profiles. However, traditional mediation analysis methods tailored for binary outcomes often rely on the rare disease assumption in logistic regression and provide inadequate measures of total mediation effect when multiple mediators have effects in different directions. In this paper, we develop a MEdiation analysis framework in LOgistic regression for high-Dimensional mediators and a binarY outcome (MELODY). It leverages a second-moment-based measure analogous to the <i>R</i><sup>2</sup> for linear models to quantify the total mediation effect. We also develop a variable selection procedure for high-dimensional data to reduce bias introduced by non-mediators. Our comprehensive simulations demonstrate the superior performance of MELODY in scenarios with non-rare disease binary outcomes and high-dimensional mediators. 
We apply MELODY to the Framingham Heart Study of over 5000 individuals to analyze the mediation effects of metabolomics and transcriptomics data on the pathways from sex to incident coronary heart disease.</p>","PeriodicalId":12710,"journal":{"name":"Genetic Epidemiology","volume":"50 1","pages":""},"PeriodicalIF":3.8,"publicationDate":"2026-01-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12820538/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146010084","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Linear Mixed Model With Measurement Error Correction (LMM-MEC): A Method for Summary-Data-Based Multivariable Mendelian Randomization","authors":"Ming Ding, Fei Zou","doi":"10.1002/gepi.70032","DOIUrl":"https://doi.org/10.1002/gepi.70032","url":null,"abstract":"<div><p>Summary-data-based multivariable Mendelian randomization (MVMR) methods, such as MVMR-Egger, MVMR-IVW, MVMR median-based, and MVMR-PRESSO, are used to assess the causal effects of multiple risk factors on disease. However, accounting for variances in the summary statistics of risk factors remains a challenge. We propose a linear mixed model with measurement error correction (LMM-MEC) that accounts for the variance in summary statistics for both disease outcomes and risk factors. First, under the NOME assumption, we apply a linear mixed model to account for variance in disease summary statistics by treating it as fixed- or random-effects, depending on whether there is heterogeneity in the effect sizes of the genetic variants on the disease outcome. Next, we relax the NOME assumption and further take the estimation error (or variance) in the summary statistics of risk factors into consideration by measurement models through a regression calibration approach. In a simulation study, using independent genetic variants as instrumental variables (IV), our method showed comparable performance to existing MVMR methods under conditions of no pleiotropy or with balanced pleiotropy on the disease outcome, and it achieved slightly improved coverage rates and power under directional pleiotropy. 
When genetic variants are in low to moderate linkage disequilibrium (LD) (0 < <i>ρ</i><sup>2</sup> ≤ 0.3), our method showed comparable performance to MVMR-Egger, although both methods showed reduced coverage rates and power compared to situations where genetic variants as IVs are independent. In the application study, we examined causal associations between correlated cholesterol biomarkers and longevity. By including 739 genetic variants selected based on <i>p</i> values < 5 × 10<sup>−5</sup> from GWAS and allowing for low LD (<i>ρ</i><sup>2</sup> ≤ 0.1), our method identified that large LDL-c levels were causally associated with a lower likelihood of achieving longevity.</p></div>","PeriodicalId":12710,"journal":{"name":"Genetic Epidemiology","volume":"50 1","pages":""},"PeriodicalIF":3.8,"publicationDate":"2026-01-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146002510","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Large-Scale Genotype-Based Trait Imputation With Multi-Ancestry GWAS Data","authors":"Jingchen Ren, Wei Pan","doi":"10.1002/gepi.70030","DOIUrl":"10.1002/gepi.70030","url":null,"abstract":"<p>Genome-wide association studies (GWAS) have been instrumental in identifying genetic variants associated with complex traits and diseases, including Alzheimer's disease (AD). However, traditional GWAS approaches often focus on European populations, which may lead to loss of power and limit the generalizability of findings across diverse ancestries. On the other hand, LS-Imputation, a nonparametric trait imputation method, leverages GWAS summary statistics and genotype data to impute missing traits, which can then be used for GWAS and other downstream analyses. Although LS-Imputation has been applied successfully to European populations, its performance in non-European populations would be hindered by smaller sample sizes, leading to reduced imputation accuracy. To address these limitations, we propose two novel variants of LS-Imputation, LS-Imputation-Combined and LS-Imputation-Transfer, designed to integrate multi-ancestry GWAS data and enhance imputation performance. LS-Imputation-Combined optimally combines GWAS summary statistics from multiple ancestries, while LS-Imputation-Transfer sequentially refines imputed trait values across ancestries using stochastic gradient descent. We evaluate these methods using data from the UK Biobank and the Alzheimer's Disease Sequencing Project (ADSP), first applying them to high-density lipoprotein (HDL) cholesterol levels as a proof-of-concept before focusing on imputing AD status in Black individuals for genetic association analysis. 
Our results demonstrate that integrating multi-ancestry GWAS data improves trait imputation accuracy, with LS-Imputation-Transfer achieving the highest performance.</p>","PeriodicalId":12710,"journal":{"name":"Genetic Epidemiology","volume":"50 1","pages":""},"PeriodicalIF":3.8,"publicationDate":"2026-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12805644/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145984816","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}