Christoffer Dharma, Peter Smith, Travis Salway, Dionne Gesink, Michael Escobar, Victoria Landsman
American Journal of Epidemiology. Published 2025-06-20. DOI: 10.1093/aje/kwaf132 (https://doi.org/10.1093/aje/kwaf132)
A two-step approach to simultaneously correct for selection and misclassification bias in nonprobability samples from hard-to-reach populations.
Researchers studying hard-to-reach or minority populations are increasingly implementing nonprobability sampling strategies, which are often prone to selection bias. To address this problem, existing statistical methods suggest integrating data from an external probability sample, often collected by government agencies, with the nonprobability sample from the hard-to-reach population. These methods assume that all information collected in the probability sample is recorded without error. This may not be the case if participants are unwilling to truthfully report their minority status, such as sexual orientation, in large-scale population-based surveys, which leads to misclassification bias. In this paper, we propose a novel two-step approach that first addresses misclassification bias in the probability sample in order to improve the performance of the data integration methods that address selection bias in the nonprobability sample. By applying the proposed method to simulated data, we demonstrate a significant reduction in bias and validate the proposed bootstrap variance estimator of the estimated mean (prevalence) under low, moderate, and high misclassification rates. The method is particularly beneficial when the misclassification rate is high. Finally, we illustrate the application of the two-step approach by estimating the prevalence of measures of social connectedness among sexual minority men using a real-world nonprobability sample.
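The abstract does not spell out the estimator, so the following is only a minimal sketch of the general idea behind a bootstrap variance estimate for a weighted prevalence. It is not the authors' method. The function names, the toy data, and the weights are all hypothetical, and the weights are held fixed across replicates for simplicity. A bootstrap for a two-step procedure would presumably repeat both steps (the misclassification correction and the weight estimation) within each replicate.

```python
import random


def weighted_prevalence(outcomes, weights):
    """Weighted mean of a binary outcome (weights normalised by their sum)."""
    total_weight = sum(weights)
    return sum(w * y for y, w in zip(outcomes, weights)) / total_weight


def bootstrap_variance(outcomes, weights, n_boot=500, seed=0):
    """Naive bootstrap: resample units with replacement and re-estimate.

    Assumption: weights are treated as fixed, which ignores the
    uncertainty from estimating them.
    """
    rng = random.Random(seed)
    n = len(outcomes)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(
            weighted_prevalence(
                [outcomes[i] for i in idx],
                [weights[i] for i in idx],
            )
        )
    mean_est = sum(estimates) / n_boot
    # Sample variance of the bootstrap replicates
    return sum((e - mean_est) ** 2 for e in estimates) / (n_boot - 1)


# Hypothetical toy data: binary outcome and pre-computed weights
outcomes = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
weights = [2.0, 1.0, 3.0, 1.5, 1.0, 2.5, 1.0, 1.0, 2.0, 1.0]

estimate = weighted_prevalence(outcomes, weights)
variance = bootstrap_variance(outcomes, weights)
print(f"prevalence estimate: {estimate:.4f}")
print(f"bootstrap variance:  {variance:.6f}")
```

Fixing the seed makes the replicates reproducible. In practice the number of replicates and the resampling scheme would depend on the sampling design of both samples.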
About the journal:
The American Journal of Epidemiology is the oldest and one of the premier epidemiologic journals devoted to the publication of empirical research findings, opinion pieces, and methodological developments in the field of epidemiologic research.
It is a peer-reviewed journal aimed at both fellow epidemiologists and those who use epidemiologic data, including public health workers and clinicians.