Selection bias and multiple inclusion criteria in observational studies

Q3 Mathematics

Epidemiologic Methods, Pub Date: 2022-01-01, DOI: 10.1515/em-2022-0108

Stina Zetterstrom, I. Waernbaum


Citations: 2

Abstract

Objectives: Spurious associations between an exposure and an outcome, which do not describe the causal estimand of interest, can result from the selection of the study population. Recently, sensitivity parameters and bounds have been proposed for selection bias, along the lines of the sensitivity analysis previously proposed for bias due to unmeasured confounding. The basis for the bounds is that the researcher specifies values for sensitivity parameters describing associations under additional identifying assumptions. The sensitivity parameters describe aspects of the joint distribution of the outcome, the selection and a vector of unmeasured variables, for each treatment group respectively. In practice, a study population is often selected on the basis of several selection criteria, which affects the proposed bounds.

Methods: We extend the previously proposed bounds to give additional guidance for practitioners on constructing i) the sensitivity parameters for multiple selection variables and ii) an alternative assumption-free bound that produces only logically feasible values. As a motivating example, we derive the bounds for causal estimands in a study of perinatal risk factors for childhood-onset Type 1 Diabetes Mellitus, where the study population was selected by multiple inclusion criteria. To give further guidance for practitioners, we provide a data learner in R in which both the sensitivity parameters and the assumption-free bounds are implemented.

Results: The assumption-free bounds can be either smaller or larger than the previously proposed bounds and can serve as an indicator of settings in which the previously proposed bounds do not produce feasible values. The motivating example shows that the assumption-free bounds may not be appropriate when the outcome or treatment is rare.

Conclusions: Bounds can provide guidance in a sensitivity analysis to assess the magnitude of selection bias. Additional knowledge is used to produce values for sensitivity parameters under multiple selection criteria. The computation of values for the sensitivity parameters is complicated by the multiple inclusion/exclusion criteria, and a data learner in R is provided to facilitate their construction. For comparison, and to assess the feasibility of the bounds, an assumption-free bound is provided that relies solely on the underlying assumptions of the potential outcomes framework.
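To make the idea of a bound that needs no sensitivity parameters concrete, the following Python sketch computes generic worst-case bounds for a risk and a risk ratio when only a selected subpopulation is observed. It is an illustration under stated assumptions and is not the bound derived in the paper or the authors' R implementation: the function names and inputs are hypothetical, the selection proportions are assumed known, and confounding is ignored, so the result bounds a population association rather than a causal estimand.

```python
import math


def worst_case_risk_bounds(p_selected, q_selected):
    """Worst-case bounds for P(Y=1 | T=t) in the total population.

    p_selected: P(Y=1 | T=t, S=1), the risk observed among the selected.
    q_selected: P(S=1 | T=t), the proportion selected (assumed known).
    The unobserved risk among the non-selected is only known to lie in [0, 1].
    """
    if not (0.0 <= p_selected <= 1.0 and 0.0 < q_selected <= 1.0):
        raise ValueError("p_selected must be in [0, 1] and q_selected in (0, 1]")
    lower = p_selected * q_selected      # non-selected risk set to 0
    upper = lower + (1.0 - q_selected)   # non-selected risk set to 1
    return lower, upper


def worst_case_risk_ratio_bounds(p1, q1, p0, q0):
    """Bounds for the risk ratio P(Y=1 | T=1) / P(Y=1 | T=0)."""
    lo1, hi1 = worst_case_risk_bounds(p1, q1)
    lo0, hi0 = worst_case_risk_bounds(p0, q0)
    # Smallest ratio: lowest exposed risk over highest unexposed risk.
    rr_lower = lo1 / hi0 if hi0 > 0 else math.nan
    # Largest ratio: highest exposed risk over lowest unexposed risk.
    rr_upper = hi1 / lo0 if lo0 > 0 else math.inf
    return rr_lower, rr_upper


if __name__ == "__main__":
    # Hypothetical rare outcome: 2% vs 1% risk, 90% selected in each group.
    print(worst_case_risk_ratio_bounds(0.02, 0.9, 0.01, 0.9))
```

With these made-up inputs the observed risk ratio among the selected is 2, while the worst-case interval runs from roughly 0.17 to roughly 13.1. Bounds of this generic kind become wide when the outcome is rare, because the unobserved 10% can dominate both risks; this is in the spirit of the caution stated in the Results, though the paper's own bounds may behave differently.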




Source journal

Epidemiologic Methods Mathematics-Applied Mathematics

CiteScore

2.10

Self-citation rate

0.00%

Articles published

About the journal: Epidemiologic Methods (EM) seeks contributions comparable to those of the leading epidemiologic journals, but also invites papers that may be more technical or of greater length than what has traditionally been allowed by journals in epidemiology. Applications and examples with real data to illustrate methodology are strongly encouraged but not required. Topics: genetic epidemiology, infectious disease, pharmaco-epidemiology, ecologic studies, environmental exposures, screening, surveillance, social networks, comparative effectiveness, statistical modeling, causal inference, measurement error, study design, meta-analysis.