Bounds for selection bias using outcome probabilities
Stina Zetterstrom
Epidemiologic Methods, published 2024-01-01
DOI: 10.1515/em-2023-0033 (https://doi.org/10.1515/em-2023-0033)
Abstract
Determining the causal relationship between exposure and outcome is the goal of many observational studies. However, the selection of subjects into the study population, either voluntary or involuntary, may result in estimates that suffer from selection bias. To assess the robustness of the estimates as well as the magnitude of the bias, bounds for the bias can be calculated. Previous bounds for selection bias often require the specification of unknown relative risks, which might be difficult to provide. Here, alternative bounds based on observed data and unknown outcome probabilities are proposed. These unknown probabilities may be easier to specify than unknown relative risks.
I derive alternative bounds from the definitions of the causal estimands using the potential outcomes framework, under specific assumptions. The bounds are expressed using observed data and unobserved outcome probabilities. The bounds are compared to previously reported bounds in a simulation study. Furthermore, a study of perinatal risk factors for type 1 diabetes is provided as a motivating example.
I show that the proposed bounds are often informative when the exposure and outcome are sufficiently common, especially for the risk difference in the total population. The proposed bounds can, however, be uninformative when the exposure and outcome are rare. Furthermore, previously proposed assumption-free bounds are special cases of the new bounds, obtained when the sensitivity parameters are set to their most conservative values.
Depending on the data-generating process and the causal estimand of interest, the proposed bounds can be tighter or wider than the reference bounds. Importantly, when the outcome and exposure are sufficiently common, the proposed bounds are often informative, especially for the risk difference in the total population. Although the new bounds are wider than the reference bounds in some cases, their sensitivity parameters, which are unobserved outcome probabilities, may be easier to specify than the unknown relative risks required by the reference bounds.
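The general idea of bounding a risk difference with observed data plus unobserved outcome probabilities can be illustrated with a small sketch. This is not the paper's derivation or its formulas, which the abstract does not give. It is a generic worst-case calculation that follows from the law of total probability, under assumptions introduced here for illustration: a binary exposure T, outcome Y, and selection indicator S; known selection probabilities `s1 = P(S=1 | T=1)` and `s0 = P(S=1 | T=0)`; and no confounding in the total population, so that the risk difference between exposure groups has a causal reading. The function and parameter names are hypothetical.

```python
def risk(p_selected, s, q_unselected):
    """P(Y=1 | T=t) via the law of total probability over selection S.

    p_selected   : P(Y=1 | T=t, S=1), observed in the selected sample
    s            : P(S=1 | T=t), assumed known for this illustration
    q_unselected : P(Y=1 | T=t, S=0), unobserved sensitivity parameter
    """
    return p_selected * s + q_unselected * (1.0 - s)


def rd_bounds(p1, p0, s1, s0, q1=(0.0, 1.0), q0=(0.0, 1.0)):
    """Lower and upper bounds for P(Y=1 | T=1) - P(Y=1 | T=0).

    q1 and q0 are (low, high) ranges for the unobserved outcome
    probabilities among the unselected exposed and unexposed subjects.
    The default range (0, 1) is the most conservative choice.
    """
    # Smallest difference: exposed risk at its minimum, unexposed at its maximum.
    lower = risk(p1, s1, q1[0]) - risk(p0, s0, q0[1])
    # Largest difference: exposed risk at its maximum, unexposed at its minimum.
    upper = risk(p1, s1, q1[1]) - risk(p0, s0, q0[0])
    return lower, upper


if __name__ == "__main__":
    # Hypothetical numbers, not taken from the paper.
    print(rd_bounds(0.3, 0.2, 0.8, 0.8))
    print(rd_bounds(0.3, 0.2, 0.8, 0.8, q1=(0.2, 0.4), q0=(0.1, 0.3)))
```

Leaving `q1` and `q0` at their default range of 0 to 1 gives the widest interval this calculation can produce, while narrower ranges for the unobserved probabilities tighten it. This mirrors, in a simplified setting, the abstract's remark that the most conservative sensitivity-parameter values yield assumption-free bounds.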
About the journal:
Epidemiologic Methods (EM) seeks contributions comparable to those of the leading epidemiologic journals, but also invites papers that may be more technical or of greater length than what has traditionally been allowed by journals in epidemiology. Applications and examples with real data to illustrate methodology are strongly encouraged but not required. Topics: genetic epidemiology, infectious disease, pharmaco-epidemiology, ecologic studies, environmental exposures, screening, surveillance, social networks, comparative effectiveness, statistical modeling, causal inference, measurement error, study design, meta-analysis.