{"title":"Deep learning identified genetic variants for COVID-19-related mortality among 28,097 affected cases in UK Biobank","authors":"Zihuan Liu, Wei Dai, Shiying Wang, Yisha Yao, Heping Zhang","doi":"10.1002/gepi.22515","DOIUrl":"10.1002/gepi.22515","url":null,"abstract":"<p>Analysis of host genetic components provides insights into the susceptibility and response to viral infection such as severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), which causes coronavirus disease 2019 (COVID-19). To reveal genetic determinants of susceptibility to COVID-19 related mortality, we train a deep learning model to identify groups of genetic variants and their interactions that contribute to the COVID-19 related mortality risk using the UK Biobank data (28,097 affected cases and 1656 deaths). We refer to such groups of variants as super variants. We identify 15 super variants with various levels of significance as susceptibility loci for COVID-19 mortality. Specifically, we identify a super variant (odds ratio [OR] = 1.594, <i>p</i> = 5.47 × 10<sup>−9</sup>) on Chromosome 7 that consists of the minor allele of rs76398985, rs6943608, rs2052130, 7:150989011_CT_C, rs118033050, and rs12540488. 
We also discover a super variant (OR = 1.353, <i>p</i> = 2.87 × 10<sup>−8</sup>) on Chromosome 5 that contains rs12517344, rs72733036, rs190052994, rs34723029, rs72734818, 5:9305797_GTA_G, and rs180899355.</p>","PeriodicalId":12710,"journal":{"name":"Genetic Epidemiology","volume":"47 3","pages":"215-230"},"PeriodicalIF":2.1,"publicationDate":"2023-01-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/gepi.22515","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9184940","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Weak and pleiotropy robust sex-stratified Mendelian randomization in the one sample and two sample settings","authors":"Vasilios Karageorgiou, Jess Tyrrell, Trevelyan J. Mckinley, Jack Bowden","doi":"10.1002/gepi.22512","DOIUrl":"10.1002/gepi.22512","url":null,"abstract":"<div><section><h3>Background</h3><p>Mendelian randomization (MR) leverages genetic data as an instrumental variable to provide estimates for the causal effect of an exposure <i>X</i> on a health outcome <i>Y</i> that are robust to confounding. Unfortunately, horizontal pleiotropy (the direct association of a genetic variant with multiple phenotypes) is highly prevalent and can easily render a genetic variant an invalid instrument.</p></section><section><h3>Methods</h3><p>Building on existing work, we propose a simple method for leveraging sex-specific genetic associations to perform weak and pleiotropy-robust MR analysis. This is achieved by constructing an MR estimator in which pleiotropy is perfectly removed by cancellation, while placing it within the powerful machinery of the robust adjusted profile score (MR-RAPS) method. Pleiotropy cancellation has the attractive property that it removes heterogeneity and therefore justifies a statistically efficient fixed effects model. We extend the method from the typical two-sample summary-data MR setting to the one-sample setting by adapting the technique of Collider-Correction. Simulation studies and applied examples are used to assess how the sex-stratified MR-RAPS estimator performs against other common approaches.</p></section><section><h3>Results</h3><p>The sex-stratified MR-RAPS method is shown to be robust to pleiotropy even in cases where all genetic variants violated the standard Instrument Strength Independent of Direct Effect assumption. 
In some cases where the strength of the pleiotropic effect additionally varied by sex (and so perfect cancellation was not achieved), over-dispersed MR-RAPS implementations can still consistently estimate the true causal effect. In applied analyses, we investigate the causal effect of waist-hip ratio (WHR), an important marker of central obesity, on a range of downstream traits. While the conventional approaches suggested paradoxical links between WHR and height and body mass index, the sex-stratified approach obtained a more realistic null effect. Nonzero effects were also detected for systolic and diastolic blood pressure as well as high-density and low-density lipoprotein cholesterol.</p></section><section><h3>Discussion</h3><p>We provide a simple but attractive method for weak and pleiotropy robust causal estimation of sexually dimorphic traits on downstream outcomes, by combining several existing approaches in a novel fashion.</p></section></div>","PeriodicalId":12710,"journal":{"name":"Genetic Epidemiology","volume":"47 2","pages":"135-151"},"PeriodicalIF":2.1,"publicationDate":"2023-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/gepi.22512","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10816122","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improved two-step testing of genome-wide gene–environment interactions","authors":"Eric S. Kawaguchi, Andre E. Kim, Juan Pablo Lewinger, W. James Gauderman","doi":"10.1002/gepi.22509","DOIUrl":"10.1002/gepi.22509","url":null,"abstract":"<p>Two-step tests for gene–environment (<math><semantics><mrow><mi>G</mi><mo>×</mo><mi>E</mi></mrow><annotation>$G\\times E$</annotation></semantics></math>) interactions exploit marginal single-nucleotide polymorphism (SNP) effects to improve the power of a genome-wide interaction scan. They combine a screening step based on marginal effects used to “bin” SNPs for weighted hypothesis testing in the second step to deliver greater power over single-step tests while preserving the genome-wide Type I error. However, the presence of many SNPs with detectable marginal effects on the trait of interest can reduce power by “displacing” true interactions with weaker marginal effects and by adding to the number of tests that need to be corrected for multiple testing. We introduce a new significance-based allocation into bins for Step-2 <math><semantics><mrow><mi>G</mi><mo>×</mo><mi>E</mi></mrow><annotation>$G\\times E$</annotation></semantics></math> testing that overcomes the displacement issue and propose a computationally efficient approach to account for multiple testing within bins. Simulation results demonstrate that these simple improvements can provide substantially greater power than current methods under several scenarios. 
An application to a multistudy collaboration for understanding colorectal cancer reveals a <i>G</i> × Sex interaction located near the SMAD7 gene.</p>","PeriodicalId":12710,"journal":{"name":"Genetic Epidemiology","volume":"47 2","pages":"152-166"},"PeriodicalIF":2.1,"publicationDate":"2022-12-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/gepi.22509","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10811874","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient identification of trait-associated loss-of-function variants in the UK Biobank cohort by exome-sequencing based genotype imputation","authors":"Wen-Yuan Yu, Shan-Shan Yan, Shu-Han Zhang, Jing-Jing Ni, Bin-Li, Yu-Fang Pei, Lei Zhang","doi":"10.1002/gepi.22511","DOIUrl":"10.1002/gepi.22511","url":null,"abstract":"<p>The large-scale open-access whole-exome sequencing (WES) data of ~200,000 UK Biobank (UKB) participants is accelerating a new wave of genetic association studies aiming to identify rare and functional loss-of-function (LoF) variants associated with complex traits and diseases. We proposed to merge the WES genotypes and the genome-wide genotyping (GWAS) genotypes of 167,000 UKB homogeneous European participants into a combined reference panel, and then to impute 241,911 UKB homogeneous European participants who had the GWAS genotypes only. We then used the imputed data to replicate associations identified in the discovery WES sample. The average imputation accuracy measure <i>r</i><sup>2</sup> is modest to high for LoF variants at all minor allele frequency (MAF) intervals: 0.942 at MAF interval (0.01, 0.5), 0.807 at (1.0 × 10<sup>−3</sup>, 0.01), 0.805 at (1.0 × 10<sup>−4</sup>, 1.0 × 10<sup>−3</sup>), 0.664 at (1.0 × 10<sup>−5</sup>, 1.0 × 10<sup>−4</sup>) and 0.410 at (0, 1.0 × 10<sup>−5</sup>). As applications, we studied associations of LoF variants with estimated heel bone mineral density (BMD) and four lipid traits. In addition to replicating dozens of previously reported genes, we also identified three novel associations: two genes, <i>PLIN1</i> and <i>ANGPTL3</i>, for high-density-lipoprotein cholesterol and one gene, <i>PDE3B</i>, for triglycerides. 
Our results highlighted the strength of WES-based genotype imputation and provided useful imputed data within the UKB cohort.</p>","PeriodicalId":12710,"journal":{"name":"Genetic Epidemiology","volume":"47 2","pages":"121-134"},"PeriodicalIF":2.1,"publicationDate":"2022-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10854554","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Methods for large-scale single mediator hypothesis testing: Possible choices and comparisons","authors":"Jiacong Du, Xiang Zhou, Dylan Clark-Boucher, Wei Hao, Yongmei Liu, Jennifer A. Smith, Bhramar Mukherjee","doi":"10.1002/gepi.22510","DOIUrl":"10.1002/gepi.22510","url":null,"abstract":"<p>Mediation hypothesis testing for a large number of mediators is challenging due to the composite structure of the null hypothesis, <math><semantics><mrow><msub><mi>H</mi><mn>0</mn></msub><mo>:</mo><mi>α</mi><mi>β</mi><mo>=</mo><mn>0</mn></mrow><annotation>${H}_{0}:\\alpha \\beta =0$</annotation></semantics></math> (<math><semantics><mrow><mi>α</mi></mrow><annotation>$\\alpha $</annotation></semantics></math>: effect of the exposure on the mediator after adjusting for confounders; <math><semantics><mrow><mi>β</mi></mrow><annotation>$\\beta $</annotation></semantics></math>: effect of the mediator on the outcome after adjusting for exposure and confounders). In this paper, we reviewed three classes of methods for large-scale one at a time mediation hypothesis testing. These methods are commonly used for continuous outcomes and continuous mediators assuming there is no exposure-mediator interaction so that the product <math><semantics><mrow><mi>α</mi><mi>β</mi></mrow><annotation>$\\alpha \\beta $</annotation></semantics></math> has a causal interpretation as the indirect effect. 
The first class of methods ignores the impact of different structures under the composite null hypothesis, namely, (1) <math><semantics><mrow><mi>α</mi><mo>=</mo><mn>0</mn><mo>,</mo><mi>β</mi><mo>≠</mo><mn>0</mn></mrow><annotation>$\\alpha =0,\\beta \\ne 0$</annotation></semantics></math>; (2) <math><semantics><mrow><mi>α</mi><mo>≠</mo><mn>0</mn><mo>,</mo><mi>β</mi><mo>=</mo><mn>0</mn></mrow>","PeriodicalId":12710,"journal":{"name":"Genetic Epidemiology","volume":"47 2","pages":"167-184"},"PeriodicalIF":2.1,"publicationDate":"2022-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/gepi.22510","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9762740","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adaptive Bayesian variable clustering via structural learning of breast cancer data","authors":"Riddhi Pratim Ghosh, Arnab K. Maity, Mohsen Pourahmadi, Bani K. Mallick","doi":"10.1002/gepi.22507","DOIUrl":"10.1002/gepi.22507","url":null,"abstract":"<p>The clustering of proteins is of interest in cancer cell biology. This article proposes a hierarchical Bayesian model for protein (variable) clustering hinging on correlation structure. Starting from a multivariate normal likelihood, we enforce the clustering through prior modeling using angle-based unconstrained reparameterization of correlations and assume a truncated Poisson distribution (to penalize a large number of clusters) as prior on the number of clusters. The posterior distributions of the parameters are not available in explicit form, so a reversible jump Markov chain Monte Carlo based technique is used to simulate the parameters from the posteriors. The end products of the proposed method are estimated cluster configuration of the proteins (variables) along with the number of clusters. The Bayesian method is flexible enough to cluster the proteins as well as estimate the number of clusters. The performance of the proposed method has been substantiated with extensive simulation studies and one protein expression data set with a hereditary disposition in breast cancer, where the proteins come from different pathways.</p>","PeriodicalId":12710,"journal":{"name":"Genetic Epidemiology","volume":"47 1","pages":"95-104"},"PeriodicalIF":2.1,"publicationDate":"2022-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10718634","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multivariate analysis of a missense variant in CREBRF reveals associations with measures of adiposity in people of Polynesian ancestries","authors":"Jerry Z. Zhang, Lacey W. Heinsberg, Mohanraj Krishnan, Nicola L. Hawley, Tanya J. Major, Jenna C. Carlson, Jennie Harré Hindmarsh, Huti Watson, Muhammad Qasim, Lisa K. Stamp, Nicola Dalbeth, Rinki Murphy, Guangyun Sun, Hong Cheng, Take Naseri, Muagututi'a S. Reupena, Erin E. Kershaw, Ranjan Deka, Stephen T. McGarvey, Ryan L. Minster, Tony R. Merriman, Daniel E. Weeks","doi":"10.1002/gepi.22508","DOIUrl":"10.1002/gepi.22508","url":null,"abstract":"<p>The minor allele of rs373863828, a missense variant in CREB3 Regulatory Factor, is associated with several cardiometabolic phenotypes in Polynesian peoples. To better understand the variant, we tested the association of rs373863828 with a panel of correlated phenotypes (body mass index [BMI], weight, height, HDL cholesterol, triglycerides, and total cholesterol) using multivariate Bayesian association and network analyses in a Samoa cohort (<i>n</i> = 1632), Aotearoa New Zealand cohort (<i>n</i> = 1419), and combined cohort (<i>n</i> = 2976). An expanded set of phenotypes (adding estimated fat and fat-free mass, abdominal circumference, hip circumference, and abdominal-hip ratio) was tested in the Samoa cohort (<i>n</i> = 1496). In the Samoa cohort, we observed significant associations (log<sub>10</sub> Bayes Factor [BF] ≥ 5.0) between rs373863828 and the overall phenotype panel (8.81), weight (8.30), and BMI (6.42). In the Aotearoa New Zealand cohort, we observed suggestive associations (1.5 < log<sub>10</sub>BF < 5) between rs373863828 and the overall phenotype panel (4.60), weight (3.27), and BMI (1.80). In the combined cohort, we observed concordant signals with larger log<sub>10</sub>BFs. 
In the Samoa-specific expanded phenotype analyses, we also observed significant associations between rs373863828 and fat mass (5.65), abdominal circumference (5.34), and hip circumference (5.09). Bayesian networks provided evidence for a direct association of rs373863828 with weight and indirect associations with height and BMI.</p>","PeriodicalId":12710,"journal":{"name":"Genetic Epidemiology","volume":"47 1","pages":"105-118"},"PeriodicalIF":2.1,"publicationDate":"2022-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/gepi.22508","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9162994","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Sparse prediction informed by genetic annotations using the logit normal prior for Bayesian regression tree ensembles","authors":"Charles Spanbauer, Wei Pan, ADNI, The Alzheimer's Disease Neuroimaging Initiative","doi":"10.1002/gepi.22505","DOIUrl":"10.1002/gepi.22505","url":null,"abstract":"<p>Using high-dimensional genetic variants such as single nucleotide polymorphisms (SNP) to predict complex diseases and traits has important applications in basic research and other clinical settings. For example, predicting gene expression is a necessary first step to identify (putative) causal genes in transcriptome-wide association studies. Due to weak signals, high-dimensionality, and linkage disequilibrium (correlation) among SNPs, building such a prediction model is challenging. However, functional annotations at the SNP level (e.g., as epigenomic data across multiple cell- or tissue-types) are available and could be used to inform predictor importance and aid in outcome prediction. Existing approaches to incorporate annotations have been based mainly on (generalized) linear models. Bayesian additive regression trees (BART), in contrast, is a reliable method to obtain high-quality nonlinear out of sample predictions without overfitting. Unfortunately, the default prior from BART may be too inflexible to handle sparse situations where the number of predictors approaches or surpasses the number of observations. Motivated by our real data application, this article proposes an alternative prior based on the logit normal distribution because it provides a framework that is adaptive to sparsity and can model informative functional annotations. It also provides a framework to incorporate prior information about the between SNP correlations. 
Computational details for carrying out inference are presented along with the results from a simulation study and a genome-wide prediction analysis of the Alzheimer's Disease Neuroimaging Initiative data.</p>","PeriodicalId":12710,"journal":{"name":"Genetic Epidemiology","volume":"47 1","pages":"26-44"},"PeriodicalIF":2.1,"publicationDate":"2022-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/gepi.22505","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9652572","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Statistical methods for cis-Mendelian randomization with two-sample summary-level data","authors":"Apostolos Gkatzionis, Stephen Burgess, Paul J. Newcombe","doi":"10.1002/gepi.22506","DOIUrl":"10.1002/gepi.22506","url":null,"abstract":"<p>Mendelian randomization (MR) is the use of genetic variants to assess the existence of a causal relationship between a risk factor and an outcome of interest. Here, we focus on two-sample summary-data MR analyses with many correlated variants from a single gene region, particularly on <i>cis</i>-MR studies which use protein expression as a risk factor. Such studies must rely on a small, curated set of variants from the studied region; using all variants in the region requires inverting an ill-conditioned genetic correlation matrix and results in numerically unstable causal effect estimates. We review methods for variable selection and estimation in <i>cis</i>-MR with summary-level data, ranging from stepwise pruning and conditional analysis to principal components analysis, factor analysis, and Bayesian variable selection. In a simulation study, we show that the various methods have comparable performance in analyses with large sample sizes and strong genetic instruments. However, when weak instrument bias is suspected, factor analysis and Bayesian variable selection produce more reliable inferences than simple pruning approaches, which are often used in practice. 
We conclude by examining two case studies, assessing the effects of low-density lipoprotein-cholesterol and serum testosterone on coronary heart disease risk using variants in the <i>HMGCR</i> and <i>SHBG</i> gene regions, respectively.</p>","PeriodicalId":12710,"journal":{"name":"Genetic Epidemiology","volume":"47 1","pages":"3-25"},"PeriodicalIF":2.1,"publicationDate":"2022-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/gepi.22506","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9297361","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mediation analysis of multiple mediators with incomplete omics data","authors":"John Kidd, Chelsea K. Raulerson, Karen L. Mohlke, Dan-Yu Lin","doi":"10.1002/gepi.22504","DOIUrl":"10.1002/gepi.22504","url":null,"abstract":"<p>There is an increasing interest in using multiple types of omics features (e.g., DNA sequences, RNA expressions, methylation, protein expressions, and metabolic profiles) to study how the relationships between phenotypes and genotypes may be mediated by other omics markers. Genotypes and phenotypes are typically available for all subjects in genetic studies, but typically, some omics data will be missing for some subjects, due to limitations such as cost and sample quality. In this article, we propose a powerful approach for mediation analysis that accommodates missing data among multiple mediators and allows for various interaction effects. We formulate the relationships among genetic variants, other omics measurements, and phenotypes through linear regression models. We derive the joint likelihood for models with two mediators, accounting for arbitrary patterns of missing values. Utilizing computationally efficient and stable algorithms, we conduct maximum likelihood estimation. Our methods produce unbiased and statistically efficient estimators. 
We demonstrate the usefulness of our methods through simulation studies and an application to the Metabolic Syndrome in Men study.</p>","PeriodicalId":12710,"journal":{"name":"Genetic Epidemiology","volume":"47 1","pages":"61-77"},"PeriodicalIF":2.1,"publicationDate":"2022-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10423053/pdf/nihms-1913096.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9991045","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}