A nonparametric method to generate synthetic populations to adjust for complex sampling design features.

IF 1.2 · Zone 4, Mathematics · Q3, SOCIAL SCIENCES, MATHEMATICAL METHODS
Survey Methodology Pub Date : 2014-06-01 Epub Date: 2014-06-27
Qi Dong, Michael R Elliott, Trivellore E Raghunathan
Citations: 0

Abstract


A nonparametric method to generate synthetic populations to adjust for complex sampling design features.

Outside of the survey sampling literature, samples are often assumed to be generated by a simple random sampling process that produces independent and identically distributed (IID) samples. Many statistical methods are developed largely in this IID world. Applying these methods to data from complex sample surveys without making allowance for the survey design features can lead to erroneous inferences. Hence, much time and effort have been devoted to developing statistical methods that analyze complex survey data and account for the sample design. This issue is particularly important when generating synthetic populations using finite population Bayesian inference, as is often done in missing data or disclosure risk settings, or when combining data from multiple surveys. By extending previous work in the finite population Bayesian bootstrap literature, we propose a method to generate synthetic populations from a posterior predictive distribution in a fashion that inverts the complex sampling design features and generates simple random samples from a superpopulation point of view, adjusting the complex data so that they can be analyzed as simple random samples. We consider a simulation study with a stratified, clustered, unequal-probability-of-selection sample design, and use the proposed nonparametric method to generate synthetic populations for the 2006 National Health Interview Survey (NHIS) and the Medical Expenditure Panel Survey (MEPS), both of which have stratified, clustered, unequal-probability-of-selection sample designs.
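
To make the idea of "inverting" unequal selection probabilities concrete, the sketch below expands a weighted sample into one synthetic population with a weighted Pólya urn, a form commonly used in the finite population Bayesian bootstrap literature. It is only an illustration under stated assumptions, not the authors' implementation: the function name `weighted_polya_urn`, the toy data, and the requirement that the weights sum to the population size N with every weight at least 1 are all hypothetical choices, and the stratification and clustering steps mentioned in the abstract are not covered.

```python
import random
from collections import Counter


def weighted_polya_urn(values, weights, seed=0):
    """Expand a weighted sample of size n into a synthetic population of size N.

    Assumptions (illustrative, not taken from the paper's code): the weights
    sum to the population size N and every weight is at least 1.
    """
    n = len(values)
    pop_size = round(sum(weights))   # N, implied by the weights
    n_draws = pop_size - n           # number of unobserved units to fill in
    rng = random.Random(seed)
    extra = [0] * n                  # l_i: how often unit i has been re-drawn
    step = n_draws / n               # (N - n) / n, the urn's reinforcement
    for _ in range(n_draws):
        # Unnormalised selection probability of unit i at this draw:
        # w_i - 1 + l_i * (N - n) / n
        urn = [w - 1 + l * step for w, l in zip(weights, extra)]
        i = rng.choices(range(n), weights=urn, k=1)[0]
        extra[i] += 1
    # Every sampled unit appears once, plus its extra draws.
    population = []
    for v, l in zip(values, extra):
        population.extend([v] * (1 + l))
    return population


# Toy sample: units "a" and "b" represent only themselves (weight 1),
# while "c" and "d" each represent four population units.
values = ["a", "b", "c", "d"]
weights = [1, 1, 4, 4]
synthetic = weighted_polya_urn(values, weights, seed=1)
counts = Counter(synthetic)
print(len(synthetic), dict(counts))  # the population size is sum(weights) = 10
```

Units with weight 1 start with zero mass in the urn and are never re-drawn, so they appear exactly once, while heavily weighted units are replicated more often. The resulting population can then be treated as if it were observed in full, and simple random samples can be drawn from it. Repeating the procedure with different seeds gives multiple synthetic populations whose variation reflects the uncertainty about the unobserved units.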

Source journal
Survey Methodology (Mathematics: Statistics and Probability)
CiteScore: 0.80
Self-citation rate: 22.20%
Articles published: 0
Review time: >12 weeks
Journal description: The journal publishes articles dealing with various aspects of statistical development relevant to a statistical agency, such as design issues in the context of practical constraints, use of different data sources and collection techniques, total survey error, survey evaluation, research in survey methodology, time series analysis, seasonal adjustment, demographic studies, data integration, estimation and data analysis methods, and general survey systems development. The emphasis is placed on the development and evaluation of specific methodologies as applied to data collection or the data themselves.