kpop: a kernel balancing approach for reducing specification assumptions in survey weighting.

Impact factor 1.6; Zone 3, Mathematics; Q2, SOCIAL SCIENCES, MATHEMATICAL METHODS

Journal of the Royal Statistical Society Series A-Statistics in Society. Pub Date: 2025-07-01. Epub Date: 2024-09-02. DOI: 10.1093/jrsssa/qnae082

Erin Hartman, Chad Hazlett, Ciara Sterbenz

Volume 188, Issue 3, Pages 875-895. Publication type: Journal Article.

Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12352454/pdf/

Citations: 0

Abstract


kpop: a kernel balancing approach for reducing specification assumptions in survey weighting.

With the precipitous decline in response rates, researchers and pollsters have been left with highly nonrepresentative samples, relying on constructed weights to make these samples representative of the desired target population. Though practitioners employ valuable expert knowledge to choose what variables $X$ must be adjusted for, they rarely defend particular functional forms relating these variables to the response process or the outcome. Unfortunately, commonly used calibration weights, which make the weighted mean of $X$ in the sample equal that of the population, only ensure correct adjustment when the portion of the outcome and the response process left unexplained by linear functions of $X$ are independent. To alleviate this functional form dependency, we describe kernel balancing for population weighting (kpop). This approach replaces the design matrix $X$ with a kernel matrix $K$ encoding high-order information about $X$. Weights are then found to make the weighted average row of $K$ among sampled units approximately equal to that of the target population. This produces good calibration on a wide range of smooth functions of $X$, without relying on the user to decide which $X$ or what functions of them to include. We describe the method and illustrate it by application to polling data from the 2016 US presidential election.
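The contrast the abstract draws between mean calibration on $X$ and balancing on a kernel matrix $K$ can be illustrated with a small NumPy sketch. This is not the authors' implementation. The Gaussian kernel, the bandwidth `b`, the truncation to the top `r` singular vectors of $K$, the least-squares (minimum-norm) solve that may yield negative weights, and the simulated data are all assumptions made for illustration, as are the function names `gaussian_kernel`, `calibrate`, and `kernel_balance`.

```python
import numpy as np


def gaussian_kernel(A, B, b):
    """Return K with K[i, j] = exp(-||A[i] - B[j]||^2 / b)."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / b)


def calibrate(F_sample, target):
    """Weights closest to uniform (in least squares) that sum to 1 and make the
    weighted column means of F_sample equal `target`. Weights may be negative."""
    n = F_sample.shape[0]
    A = np.vstack([np.ones(n), F_sample.T])      # one constraint per row
    t = np.concatenate([[1.0], target])
    u = np.full(n, 1.0 / n)                      # uniform starting weights
    # For an underdetermined system, lstsq returns the minimum-norm correction.
    delta, *_ = np.linalg.lstsq(A, t - A @ u, rcond=None)
    return u + delta


def kernel_balance(X_s, X_p, b, r):
    """Illustrative kernel balancing: calibrate on the top-r left singular
    vectors of K (rows: all units, columns: sampled units)."""
    n = X_s.shape[0]
    K = gaussian_kernel(np.vstack([X_s, X_p]), X_s, b)
    U, _, _ = np.linalg.svd(K, full_matrices=False)
    U_s, U_p = U[:n, :r], U[n:, :r]
    return calibrate(U_s, U_p.mean(axis=0)), U_s, U_p


# Simulated data (assumption): response probability depends on x1 nonlinearly.
rng = np.random.default_rng(0)
X_pop = rng.normal(size=(1000, 2))
probs = 1.0 + X_pop[:, 0] ** 2
idx = rng.choice(1000, size=200, replace=False, p=probs / probs.sum())
X_smp = X_pop[idx]

w_unif = np.full(200, 1.0 / 200)
w_lin = calibrate(X_smp, X_pop.mean(axis=0))                 # mean calibration on X
w_ker, U_s, U_p = kernel_balance(X_smp, X_pop, b=4.0, r=10)  # kernel balancing sketch

# Remaining imbalance on a nonlinear function of X that was never specified.
f_s, f_p = X_smp[:, 0] ** 2, X_pop[:, 0] ** 2
for name, w in [("uniform", w_unif), ("linear", w_lin), ("kernel", w_ker)]:
    print(f"{name:8s} |weighted mean of x1^2 - population mean| = {abs(w @ f_s - f_p.mean()):.4f}")
```

Mean calibration only constrains the weighted means of the columns of $X$, so any imbalance on $x_1^2$ is left to chance. The kernel version constrains directions derived from $K$, which is how the approach aims to cover smooth functions of $X$ without the user listing them. How much imbalance remains in practice depends on the assumed bandwidth and on `r`.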


Source journal

Journal of the Royal Statistical Society Series A-Statistics in Society (Mathematics: Statistics and Probability)

CiteScore: 2.90

Self-citation rate: 5.00%

Articles published: 136

Review time: >12 weeks

About the journal: Series A (Statistics in Society) publishes high quality papers that demonstrate how statistical thinking, design and analyses play a vital role in all walks of life and benefit society in general. There is no restriction on subject-matter: any interesting, topical and revelatory applications of statistics are welcome. For example, important applications of statistical and related data science methodology in medicine, business and commerce, industry, economics and finance, education and teaching, physical and biomedical sciences, the environment, the law, government and politics, demography, psychology, sociology and sport all fall within the journal's remit. The journal is therefore aimed at a wide statistical audience and at professional statisticians in particular. Its emphasis is on well-written and clearly reasoned quantitative approaches to problems in the real world rather than the exposition of technical detail. Thus, although the methodological basis of papers must be sound and adequately explained, methodology per se should not be the main focus of a Series A paper. Of particular interest are papers on topical or contentious statistical issues, papers which give reviews or exposés of current statistical concerns and papers which demonstrate how appropriate statistical thinking has contributed to our understanding of important substantive questions. Historical, professional and biographical contributions are also welcome, as are discussions of methods of data collection and of ethical issues, provided that all such papers have substantial statistical relevance.