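Optimal sample allocation for design-consistent regression in a cancer services survey when design variables are known for aggregates.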

IF 1.2 | Zone 4, Mathematics | JCR Q3: SOCIAL SCIENCES, MATHEMATICAL METHODS

Survey Methodology Pub Date : 2008-06-01

Alan M Zaslavsky, Hui Zheng, John Adams

Survey Methodology, volume 34, issue 1, pages 65-78. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2725367/pdf/nihms-105215.pdf

Citations: 0

Abstract


Optimal sample allocation for design-consistent regression in a cancer services survey when design variables are known for aggregates.

We consider optimal sampling rates in element-sampling designs when the anticipated analysis is survey-weighted linear regression and the estimands of interest are linear combinations of regression coefficients from one or more models. Methods are first developed assuming that exact design information is available in the sampling frame and then generalized to situations in which some design variables are available only as aggregates for groups of potential subjects, or from inaccurate or old data. We also consider design for estimation of combinations of coefficients from more than one model. A further generalization allows for flexible combinations of coefficients chosen to improve estimation of one effect while controlling for another. Potential applications include estimation of means for several sets of overlapping domains, or improving estimates for subpopulations such as minority races by disproportionate sampling of geographic areas. In the motivating problem of designing a survey on care received by cancer patients (the CanCORS study), potential design information included block-level census data on race/ethnicity and poverty as well as individual-level data. In one study site, an unequal-probability sampling design using the subjects' residential addresses and census data would have reduced the variance of the estimator of an income effect by 25%, or by 38% if the subjects' races were also known. With flexible weighting of the income contrasts by race, the variance of the estimator would be reduced by 26% using residential addresses alone and by 52% using addresses and races. Our methods would be useful in studies in which geographic oversampling by race-ethnicity or socioeconomic characteristics is considered, or in any study in which characteristics available in sampling frames are measured with error.
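The simplest case described in the abstract (exact design information for every frame unit, one model, one linear combination of coefficients) can be illustrated with a small sketch. It is not the authors' algorithm. It assumes independent (Poisson) selection, known residual standard deviations, and the usual linearization in which the variance of the weighted estimator of c'β is approximately the sum of u_i² σ_i² / π_i, where u_i = x_i' A⁻¹ c and A = X'X, ignoring the finite-population correction. Under those assumptions, minimizing the variance for a fixed expected sample size gives inclusion probabilities proportional to |u_i| σ_i. The function names and the toy frame below are hypothetical.

```python
import numpy as np


def influence_weights(X, c):
    """Return u_i = x_i' A^{-1} c, where A = X'X is the frame cross-product matrix."""
    A = X.T @ X
    return X @ np.linalg.solve(A, c)


def optimal_rates(X, c, sigma, n):
    """Inclusion probabilities proportional to |u_i| * sigma_i, scaled to sum to n.

    Assumption: no resulting probability exceeds 1; this sketch does not cap them.
    """
    score = np.abs(influence_weights(X, c)) * sigma
    return n * score / score.sum()


def approx_variance(X, c, sigma, pi):
    """Approximate variance of c' beta_hat: sum_i u_i^2 * sigma_i^2 / pi_i."""
    u = influence_weights(X, c)
    return float(np.sum(u**2 * sigma**2 / pi))


# Hypothetical frame: 1000 people, 100 of whom belong to a minority group.
group = np.concatenate([np.zeros(900), np.ones(100)])
X = np.column_stack([np.ones(1000), group])  # intercept and group indicator
c = np.array([0.0, 1.0])                     # estimand: the group coefficient
sigma = np.ones(1000)                        # assumed equal residual SDs
n = 100                                      # expected sample size

pi_opt = optimal_rates(X, c, sigma, n)       # unequal-probability design
pi_eq = np.full(1000, n / 1000)              # equal-probability design
v_opt = approx_variance(X, c, sigma, pi_opt)
v_eq = approx_variance(X, c, sigma, pi_eq)
print(round(v_opt, 4), round(v_eq, 4))
```

In this toy frame the rule selects an expected 50 people from each group, which is the familiar equal allocation for comparing two means, and the approximate variance drops from about 0.111 to 0.04. Handling design variables known only for aggregates, as in the paper, would require replacing the unit-level quantities with expectations given the aggregate data, which this sketch does not attempt.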


Source journal

Survey Methodology (Mathematics: Statistics & Probability)

CiteScore

0.80

Self-citation rate

22.20%

Review time

>12 weeks

About the journal: The journal publishes articles dealing with various aspects of statistical development relevant to a statistical agency, such as design issues in the context of practical constraints, use of different data sources and collection techniques, total survey error, survey evaluation, research in survey methodology, time series analysis, seasonal adjustment, demographic studies, data integration, estimation and data analysis methods, and general survey systems development. The emphasis is placed on the development and evaluation of specific methodologies as applied to data collection or the data themselves.