The Importance of Sensitivity Analyses for the MR Steiger Approach
Sharon M. Lutz, Kirsten Voorhies, John E. Hokanson, Stijn Vansteelandt, Christoph Lange
Genetic Epidemiology 49(7), published 2025-09-04. DOI: 10.1002/gepi.70018
https://onlinelibrary.wiley.com/doi/10.1002/gepi.70018
Citations: 0
Abstract
An extension to Mendelian randomization (MR), MR Steiger uses single nucleotide polymorphisms (SNPs) in an instrumental variables framework to infer the causal direction between two phenotypes (Hemani et al. 2017). In 2021 and 2022, we explored the role of unmeasured confounding, pleiotropy, and measurement error on the performance of the MR Steiger approach (Lutz et al. 2021) as well as selection bias (Lutz et al. 2022a). In 2022, we used simulation studies to further examine the role of unmeasured confounding on the general performance of the MR Steiger approach. These studies showed that unmeasured confounding can increase the variance of phenotype 1 relative to phenotype 2 such that the approach infers the wrong causal direction between the two phenotypes. We also created an R package, UCRMS, to reproduce these simulation studies (Lutz et al. 2022b). However, in a 2023 paper, Hemani et al. incorrectly stated that “Lutz et al. (2022) propose an R package (UCRMS) for performing sensitivity analysis of the MR Steiger method” (Hemani et al. 2023), where a sensitivity analysis examines how different values of an independent variable affect a dependent variable under a given set of assumptions. The purpose of our R package (UCRMS) was to examine the general performance of the MR Steiger approach in the presence of unmeasured confounding; it was not intended as a package for sensitivity analyses. In the same paper, Hemani et al. state that “If [Lutz et al.] were presenting a simulation of the general performance of MR Steiger under unmeasured confounding then it would not matter that the simulated parameters are not tied to those observed in a particular empirical analysis” (Hemani et al. 2023), which describes the original purpose of our R package correctly: a simulation to assess the performance of the MR Steiger approach, not a sensitivity analysis.
Here, $\beta_{OLS}$ is the “observed effect” of phenotype X on phenotype Y, which may differ from the true effect $\beta_{xy}$ as a result of confounding by U.
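The relationship between the two quantities can be made explicit under a simple linear model. The derivation below is a sketch under assumptions that are not stated in the text: a single SNP G, linear effects, and mutually independent G, U, $\varepsilon_x$, and $\varepsilon_y$. The symbols $\beta_{gx}$, $\beta_{ux}$, and $\beta_{uy}$ are introduced here for illustration only.

```latex
\begin{align*}
X &= \beta_{gx}\, G + \beta_{ux}\, U + \varepsilon_x, \\
Y &= \beta_{xy}\, X + \beta_{uy}\, U + \varepsilon_y, \\
\beta_{OLS} &= \frac{\operatorname{Cov}(X, Y)}{\operatorname{Var}(X)}
  = \beta_{xy} + \beta_{ux}\,\beta_{uy}\,\frac{\operatorname{Var}(U)}{\operatorname{Var}(X)} .
\end{align*}
```

Under these assumptions, the gap between $\beta_{OLS}$ and $\beta_{xy}$ is driven entirely by the product $\beta_{ux}\beta_{uy}$, scaled by the share of the variance of X that is attributable to U.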
As stated by Hemani et al., estimates of $\beta_{xy}$ and $\beta_{OLS}$ are needed since the true effect of phenotype X on phenotype Y given the unmeasured confounder U (i.e., $\beta_{xy}$) and the observed effect of X on Y not accounting for the unmeasured confounder (i.e., $\beta_{OLS}$) are both unknown. As stated in the supplement of Hemani et al. (2023), if the difference between $\beta_{xy}$ and $\beta_{OLS}$ is large, then the wrong causal direction can be inferred. This is especially true if, in addition, the variance of phenotype X is larger than the variance of phenotype Y. Therefore, it is very important that the estimate of $\beta_{xy}$ is close to the true value of $\beta_{xy}$. By estimating $\beta_{xy}$ using MR, as in the data analysis example of the supplement of the Hemani et al. paper, one implicitly requires all assumptions for this MR approach to be met. If these assumptions are not met, or if sampling variability is large enough that the estimate of $\beta_{xy}$ differs substantially from the true value of $\beta_{xy}$, then the sensitivity analysis may under- or overestimate the effect of unmeasured confounding, which may change the probability of the correct direction being inferred.
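The mechanism can be illustrated with a small simulation. The sketch below is neither the UCRMS package nor the Hemani et al. code. It assumes a single SNP G, a linear model with one unmeasured confounder U, and hypothetical parameter values, and it reduces MR Steiger to a comparison of the point estimates of $r^2(G, X)$ and $r^2(G, Y)$ without the accompanying test.

```python
import numpy as np


def steiger_infers_x_to_y(g, x, y):
    """Simplified Steiger rule: infer X -> Y when r2(G, X) > r2(G, Y)."""
    r_gx = np.corrcoef(g, x)[0, 1]
    r_gy = np.corrcoef(g, y)[0, 1]
    return bool(r_gx ** 2 > r_gy ** 2)


def proportion_correct(beta_ux, beta_uy, n=10_000, reps=200,
                       beta_gx=0.3, beta_xy=1.0, seed=0):
    """Share of replicates in which the true direction X -> Y is inferred.

    All parameter values are hypothetical and chosen for illustration.
    """
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(reps):
        g = rng.binomial(2, 0.3, size=n).astype(float)  # SNP, assumed MAF 0.3
        u = rng.normal(size=n)                          # unmeasured confounder
        x = beta_gx * g + beta_ux * u + rng.normal(size=n)
        y = beta_xy * x + beta_uy * u + rng.normal(size=n)
        hits += int(steiger_infers_x_to_y(g, x, y))
    return hits / reps


# No confounding versus a confounder acting in opposite directions on X and Y.
p_none = proportion_correct(beta_ux=0.0, beta_uy=0.0)
p_opposing = proportion_correct(beta_ux=2.0, beta_uy=-2.0)
print(f"no confounding:       {p_none:.2f}")
print(f"opposing confounding: {p_opposing:.2f}")
```

In the second scenario the confounder inflates the variance of X relative to that of Y, and the implied $\beta_{OLS}$ is about 0.2 against a true $\beta_{xy}$ of 1, so the simplified rule mostly points the wrong way. The size of the effect depends on the assumed values.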
In view of this, we are concerned about the choice of parameters used in the sensitivity analysis for the data analysis in the supplement (Hemani et al. 2023), which explores the role of unmeasured confounding on the MR Steiger approach to infer the effect direction of body mass index (BMI) and systolic blood pressure (SBP). In Hemani et al.'s analysis, the estimated effects of SNPs on BMI are obtained for the UK Biobank among participants of European ancestry. The true effect of BMI on SBP (i.e., $\beta_{xy}$) is estimated using MR IVW in the UK Biobank among participants of European ancestry. The variances were set to 1 for phenotype X (i.e., BMI), phenotype Y (i.e., SBP), the SNPs (i.e., G), and the unmeasured confounder U. However, Hemani et al. used the observed effect of BMI on SBP (i.e., $\beta_{OLS}$) from a study of 1.7 million Chinese adults that examined the relationship between BMI and blood pressure (Linderman et al. 2018). While Hemani et al. show that the correct direction is inferred 98% of the time for the sensitivity analysis, this proportion would potentially decrease if the observed effect of BMI on SBP for the Chinese population (i.e., $\beta_{OLS}$) differed more from the true causal effect of BMI on SBP for the Chinese population (i.e., $\beta_{xy}$). It is hence unclear whether the true effect of BMI on SBP for the Chinese population (i.e., $\beta_{xy}$) can be assumed to equal the estimated effect of BMI on SBP in the UK Biobank, given the substantial differences between the two populations, e.g., in diet, lifestyle factors, and environmental exposures. In addition, the estimated value of $\beta_{xy}$ using MR IVW in the UK Biobank can potentially differ from the true value because of the IV assumptions being violated (i.e., as a result of ignoring the possibly more complex underlying longitudinal structure where feedback relations may exist) or because of large sampling variability.
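To see why the plugged-in values matter, the following sketch computes two large-sample quantities implied by a pair ($\beta_{xy}$, $\beta_{OLS}$). It is not the Hemani et al. sensitivity analysis. It assumes a linear model with a single SNP G that affects Y only through X, a single confounder U independent of G, and hypothetical input values, and it does not check whether the implied residual variances are non-negative.

```python
import math


def implied_quantities(beta_xy, beta_ols, var_x=1.0, var_y=1.0, var_u=1.0):
    """Large-sample quantities implied by (beta_xy, beta_ols).

    Returns the product of the confounder's effects on X and Y, and the
    ratio r2(G, Y) / r2(G, X). A ratio below 1 means X -> Y is inferred.
    """
    confounding_product = (beta_ols - beta_xy) * var_x / var_u
    r2_ratio = beta_xy ** 2 * var_x / var_y
    return confounding_product, r2_ratio


beta_ols = 0.3  # hypothetical observed effect, in standardized units
for beta_xy in (0.1, 0.3, 0.6, 0.9, 1.2):  # hypothetical causal effects
    product, ratio = implied_quantities(beta_xy, beta_ols)
    direction = "X -> Y" if ratio < 1 else "Y -> X"
    print(f"beta_xy={beta_xy:.1f}  implied confounding product={product:+.2f}  "
          f"r2 ratio={ratio:.2f}  inferred: {direction}")
```

Holding $\beta_{OLS}$ fixed, a different assumed $\beta_{xy}$ changes both the amount of confounding the analysis attributes to U and the margin by which the direction is decided, which is why pairing estimates from two different populations deserves scrutiny.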
Furthermore, the proposed sensitivity analysis by Hemani et al. does not account for known confounders of phenotype X and phenotype Y. While most MR approaches are robust to confounding between the two phenotypes, the MR Steiger approach is not. Therefore, it is unclear how the sensitivity analysis accounts for confounders of phenotype X and phenotype Y while examining the role of a single unmeasured confounder. For example, while the data analysis by Hemani et al. focuses on the effect of BMI on SBP in the overall sample, several studies examining the effect of BMI on SBP stratify by sex (Adler et al. 2015; Cox et al. 1997; Chen et al. 2018; Li et al. 2015; Dua et al. 2014). Smoking rates also differ substantially by sex. In the UK in 2022, 12.9% of the population was categorized as current smokers (14.6% of males and 11.2% of females) (Office for National Statistics ONS 2023). In China in 2018, one study reported that 2% of women smoked while 50% of men smoked (Chan et al. 2023). Since both smoking and sex affect BMI and SBP, it is unclear how the sensitivity analysis for this data analysis accounts for the effect of sex and smoking while examining the role of unmeasured confounding. It would be beneficial for the analyst if the sensitivity analysis presented by Hemani et al. allowed users to specify the effect of known confounders while examining the effect of unmeasured confounding for the MR Steiger approach.
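A toy example shows why measured confounders cannot simply be ignored. The sketch below is an illustration only, not a method proposed in this letter or by Hemani et al. It assumes a single SNP G, one measured binary confounder C, deliberately exaggerated hypothetical effect sizes, and a simplified Steiger rule that compares point estimates of $r^2$. Adjustment is done by regressing X and Y on C and using the residuals.

```python
import numpy as np


def residualize(v, covariates):
    """Residuals of v after least-squares regression on covariates and an intercept."""
    design = np.column_stack([np.ones(len(v)), covariates])
    coef, *_ = np.linalg.lstsq(design, v, rcond=None)
    return v - design @ coef


def r2(a, b):
    """Squared Pearson correlation between two 1-D arrays."""
    return np.corrcoef(a, b)[0, 1] ** 2


rng = np.random.default_rng(1)
n = 20_000
g = rng.binomial(2, 0.3, size=n).astype(float)  # SNP, assumed MAF 0.3
c = rng.binomial(1, 0.5, size=n).astype(float)  # measured binary confounder
x = 0.3 * g + 4.0 * c + rng.normal(size=n)      # hypothetical effects on X
y = 1.0 * x - 4.0 * c + rng.normal(size=n)      # true direction is X -> Y

# Simplified Steiger rule on the raw phenotypes.
unadjusted_correct = bool(r2(g, x) > r2(g, y))

# Same rule after removing the measured confounder from both phenotypes.
x_adj = residualize(x, c)
y_adj = residualize(y, c)
adjusted_correct = bool(r2(g, x_adj) > r2(g, y_adj))

print("correct direction without adjustment:", unadjusted_correct)
print("correct direction with adjustment:   ", adjusted_correct)
```

In this constructed setting the inferred direction depends on whether the measured confounder is handled, which is the kind of input a sensitivity analysis would need to let users specify. Real effects of sex or smoking would be far smaller than the values assumed here.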
Research reported in this publication was supported by the National Institute of Mental Health under Award Number R01MH129337. This study was supported by NHLBI R01MH129337.
The authors declare no conflicts of interest.
About the journal:
Genetic Epidemiology is a peer-reviewed journal for discussion of research on the genetic causes of the distribution of human traits in families and populations. Emphasis is placed on the relative contribution of genetic and environmental factors to human disease as revealed by genetic, epidemiological, and biologic investigations.
Genetic Epidemiology primarily publishes papers in statistical genetics, a research field that is primarily concerned with development of statistical, bioinformatical, and computational models for analyzing genetic data. Incorporation of underlying biology and population genetics into conceptual models is favored. The Journal seeks original articles comprising either applied research or innovative statistical, mathematical, computational, or genomic methodologies that advance studies in genetic epidemiology. Other types of reports are encouraged, such as letters to the editor, topic reviews, and perspectives from other fields of research that will likely enrich the field of genetic epidemiology.