A two-step approach to simultaneously correct for selection and misclassification bias in nonprobability samples from hard-to-reach populations.

IF 5 | Zone 2 (Medicine) | JCR Q1: PUBLIC, ENVIRONMENTAL & OCCUPATIONAL HEALTH
Christoffer Dharma, Peter Smith, Travis Salway, Dionne Gesink, Michael Escobar, Victoria Landsman
American Journal of Epidemiology. Journal Article. Published 2025-06-20. DOI: 10.1093/aje/kwaf132 (https://doi.org/10.1093/aje/kwaf132)
Citation count: 0

Abstract

Researchers studying hard-to-reach or minority populations are increasingly implementing nonprobability sampling strategies, which are often prone to selection bias. To address this problem, existing statistical methods suggest integrating data from an external probability sample, often collected by government agencies, with the nonprobability sample from the hard-to-reach population. These methods assume that all information collected in the probability sample is recorded without error. This may not be the case if participants are unwilling to report their minority status, such as sexual orientation, truthfully in large-scale population-based surveys, leading to misclassification bias. In this paper, we propose a novel two-step approach that addresses misclassification bias in the probability sample in order to improve the performance of the data integration methods that address selection bias in the nonprobability sample. By applying the proposed method to simulated data, we demonstrate a significant reduction in bias and validate the proposed bootstrap variance estimator of the estimated mean (prevalence) under low, moderate, and high misclassification rates. The method is particularly beneficial when the misclassification rate is high. Finally, we illustrate the application of the two-step approach by estimating the prevalence of measures of social connectedness among sexual minority men using a real-world nonprobability sample.
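
The abstract does not give the estimator itself, so the following is only a toy sketch of the general two-step idea, not the authors' method. It assumes (hypothetically) that minority status is only ever under-reported, that the probability of truthful reporting is known for each covariate stratum, and that simple post-stratification stands in for the data integration step. All function names, strata, and numbers are invented for illustration, and the bootstrap here resamples only the nonprobability sample.

```python
import random

def corrected_stratum_totals(reported_totals, sensitivity):
    """Step 1 (assumed form): undo under-reporting in the probability sample.

    reported_totals: weighted count of self-reported minority members per stratum.
    sensitivity: assumed probability of truthful reporting per stratum.
    """
    return {s: n / sensitivity[s] for s, n in reported_totals.items()}

def poststratified_prevalence(sample, stratum_totals):
    """Step 2 (assumed form): weight stratum means from the nonprobability
    sample by the population totals to reduce selection bias.

    sample: list of (stratum, outcome) pairs with outcome in {0, 1}.
    """
    sums, counts = {}, {}
    for stratum, y in sample:
        sums[stratum] = sums.get(stratum, 0) + y
        counts[stratum] = counts.get(stratum, 0) + 1
    # Only strata present in the sample contribute to the estimate.
    total = sum(stratum_totals[s] for s in counts)
    return sum(stratum_totals[s] * sums[s] / counts[s] for s in counts) / total

def bootstrap_variance(sample, stratum_totals, n_boot=500, seed=0):
    """Bootstrap variance of the prevalence estimate (resamples the
    nonprobability sample only; a simplification)."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        resample = rng.choices(sample, k=len(sample))
        estimates.append(poststratified_prevalence(resample, stratum_totals))
    mean = sum(estimates) / n_boot
    return sum((e - mean) ** 2 for e in estimates) / (n_boot - 1)

# Hypothetical inputs: older respondents under-report more than younger ones.
reported_totals = {"young": 900, "old": 600}
sensitivity = {"young": 0.9, "old": 0.6}
corrected = corrected_stratum_totals(reported_totals, sensitivity)

# Hypothetical nonprobability sample that over-represents the young stratum.
sample = ([("young", 1)] * 40 + [("young", 0)] * 40
          + [("old", 1)] * 4 + [("old", 0)] * 16)

naive = sum(y for _, y in sample) / len(sample)
uncorrected = poststratified_prevalence(sample, reported_totals)
two_step = poststratified_prevalence(sample, corrected)
variance = bootstrap_variance(sample, corrected)
print(naive, uncorrected, two_step, variance)
```

In this toy setup the unweighted sample mean, the estimate weighted by the misreported totals, and the estimate weighted by the corrected totals all differ, which is the kind of gap the two-step correction is meant to close. The correction only changes the estimate because the assumed reporting probability varies across strata; with a single common rate it would cancel out of the weighted average.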

Source journal
American Journal of Epidemiology (Medicine: Public, Environmental & Occupational Health)
CiteScore: 7.40
Self-citation rate: 4.00%
Articles published: 221
Review time: 3-6 weeks
Journal description: The American Journal of Epidemiology is the oldest and one of the premier epidemiologic journals devoted to the publication of empirical research findings, opinion pieces, and methodological developments in the field of epidemiologic research. It is a peer-reviewed journal aimed at both fellow epidemiologists and those who use epidemiologic data, including public health workers and clinicians.