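Using model-assisted calibration methods to improve efficiency of regression analyses using two-phase samples or pooled samples under complex survey designs.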

IF 1.7 · Region 4 (Mathematics) · JCR Q3 (Biology)
Biometrics Pub Date : 2025-07-03 DOI:10.1093/biomtc/ujaf092
Lingxiao Wang
Biometrics, volume 81, issue 3. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12288669/pdf/
Citations: 0

Abstract

Two-phase sampling designs are frequently applied in epidemiological studies and large-scale health surveys. In such designs, certain variables are collected exclusively within a second-phase random subsample of the initial first-phase sample, often due to factors such as high costs, response burden, or constraints on data collection or assessment. Consequently, second-phase sample estimators can be inefficient due to the diminished sample size. Model-assisted calibration methods have been used to improve the efficiency of second-phase estimators in regression analysis. However, limited literature provides valid finite population inferences of the calibration estimators that use appropriate calibration auxiliary variables while simultaneously accounting for the complex sample designs in the first- and second-phase samples. Moreover, no literature considers the "pooled design" where some covariates are measured exclusively in certain repeated survey cycles. This paper proposes calibrating the sample weights for the second-phase sample to the weighted first-phase sample based on score functions of the regression model that uses predictions of the second-phase variable for the first-phase sample. We establish the consistency of estimation using calibrated weights and provide variance estimation for the regression coefficients under the two-phase design or the pooled design nested within complex survey designs. Empirical evidence highlights the efficiency and robustness of the proposed calibration compared to existing calibration and imputation methods. Data examples from the National Health and Nutrition Examination Survey are provided.
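
The calibration idea described in the abstract can be illustrated with a small simulation. What follows is a minimal sketch under stated assumptions, not the paper's implementation: it assumes a linear outcome model, plain linear (chi-square distance) calibration, and simulated data in place of a real survey. All names (`d1`, `pi2`, `z_hat`, `w_cal`, `wls`) are hypothetical, and variance estimation under the complex design is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Simulated first-phase sample (all names and values are hypothetical) ---
n1 = 2000
x = rng.normal(size=n1)                      # cheap covariate, observed in phase 1
z = 0.6 * x + rng.normal(size=n1)            # expensive covariate, used from phase 2 only
y = 1.0 + 0.5 * x + 0.8 * z + rng.normal(size=n1)
d1 = rng.uniform(50, 150, size=n1)           # first-phase design weights

# --- Second-phase subsample with known inclusion probabilities ---
pi2 = np.where(y > np.median(y), 0.4, 0.2)   # selection depends on phase-1 data
in2 = rng.random(n1) < pi2
d2 = d1[in2] / pi2[in2]                      # second-phase design weights


def wls(X, y, w):
    """Weighted least squares coefficients."""
    XtW = X.T * w
    return np.linalg.solve(XtW @ X, XtW @ y)


# Step 1: predict z for every first-phase unit from a model fit on phase 2.
P = np.column_stack([np.ones(n1), x, y])
gamma = wls(P[in2], z[in2], d2)
z_hat = P @ gamma

# Step 2: fit the outcome model with z_hat on the full first-phase sample
# and compute each unit's score (estimating-function) contribution.
X_tilde = np.column_stack([np.ones(n1), x, z_hat])
beta_tilde = wls(X_tilde, y, d1)
scores = X_tilde * (y - X_tilde @ beta_tilde)[:, None]

# Step 3: linear calibration of the phase-2 weights so that weighted totals
# of (1, scores) in phase 2 match the weighted first-phase totals.
A = np.column_stack([np.ones(n1), scores])
targets = A.T @ d1
A2 = A[in2]
lam = np.linalg.solve((A2.T * d2) @ A2, targets - A2.T @ d2)
w_cal = d2 * (1.0 + A2 @ lam)

# Step 4: regression of interest on phase 2, design weights vs calibrated weights.
X2 = np.column_stack([np.ones(in2.sum()), x[in2], z[in2]])
beta_design = wls(X2, y[in2], d2)
beta_cal = wls(X2, y[in2], w_cal)
```

By construction, the calibrated second-phase weights reproduce the first-phase weighted totals of the score variables. Comparing `beta_cal` with `beta_design` over repeated simulated samples would indicate whether calibration reduces variance in a given setting. Linear calibration can produce negative weights, so a bounded distance function may be preferable in practice.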

Source journal
Biometrics (category: Biology)
CiteScore: 2.70
Self-citation rate: 5.30%
Articles published: 178
Review time: 4-8 weeks
About the journal: The International Biometric Society is an international society promoting the development and application of statistical and mathematical theory and methods in the biosciences, including agriculture, biomedical science and public health, ecology, environmental sciences, forestry, and allied disciplines. The Society welcomes as members statisticians, mathematicians, biological scientists, and others devoted to interdisciplinary efforts in advancing the collection and interpretation of information in the biosciences. The Society sponsors the biennial International Biometric Conference, held in sites throughout the world; through its National Groups and Regions, it also sponsors regional and local meetings.