Journal of Survey Statistics and Methodology: Latest Articles

Toward a Principled Workflow for Prevalence Mapping Using Household Survey Data.

IF 1.6, Zone 4 (Mathematics)

Journal of Survey Statistics and Methodology. Pub Date: 2026-02-06. eCollection Date: 2026-02-01. DOI: 10.1093/jssam/smaf048

Qianyu Dong, Yunhan Wu, Zehang Richard Li, Jon Wakefield

Abstract: Understanding the prevalence of key demographic and health indicators in small geographic areas and domains is of global interest, especially in low- and middle-income countries (LMICs), where vital registration data is sparse and household surveys are the primary source of information. Recent advances in computation and the increasing availability of spatially detailed datasets have led to much progress in sophisticated statistical modeling of prevalence. As a result, high-resolution prevalence maps for many indicators are routinely produced in the literature. However, statistical and practical guidance for producing prevalence maps in LMICs has been largely lacking. In particular, advice in choosing and evaluating models and interpreting results is needed, especially when data is limited. Software and analysis tools are also usually inaccessible to researchers in low-resource settings to conduct their own analysis or reproduce findings in the literature. In this paper, we propose a general workflow for prevalence mapping using household survey data. We consider all stages of the analysis pipeline, with particular emphasis on model choice and interpretation. We illustrate the proposed workflow using a case study mapping the proportion of pregnant women who had at least four antenatal care visits in Kenya. The workflow is implemented using the R package surveyPrev, and all reproducible code is provided in the Supplementary Materials. It can be readily extended to a wide range of indicators.

Volume 14, Issue 1, pp. 209-237. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13118171/pdf/

Citations: 0

Direct-Assisted Bayesian Unit-level Modeling for Small Area Estimation of Rare Event Prevalence.

IF 1.6, Zone 4 (Mathematics)

Journal of Survey Statistics and Methodology. Pub Date: 2026-02-01. Epub Date: 2026-03-12. DOI: 10.1093/jssam/smaf037

Alana McGovern, Katherine Wilson, Jon Wakefield

Abstract: Small area estimation using survey data can be achieved by using either a design-based or a model-based inferential approach. Design-based direct estimators are generally preferable because of their consistency, asymptotic normality, and reliance on fewer assumptions. However, when data are sparse at the desired area level, as is often the case when measuring rare events, these direct estimators can have extremely large uncertainty, making a model-based approach preferable. A model-based approach with a random spatial effect borrows information from surrounding areas at the cost of inducing shrinkage. As a result, estimates may be over-smoothed and inconsistent with design-based estimates at higher area levels when aggregated. We propose two unit-level Bayesian models for small area estimation of rare event prevalence which use design-based direct estimates at a higher area level to increase consistency in aggregation. This model framework is designed to accommodate sparse data obtained from two-stage stratified cluster sampling, which is particularly relevant to applications in low- and middle-income countries. After introducing the model framework and its implementation, we conduct a simulation study to evaluate its properties and apply it to the estimation of the neonatal mortality rate in Zambia, using 2014 Demographic Health Surveys data.

Volume 14, Issue 1, pp. 178-208. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13124199/pdf/

Citations: 0

MEETING DATA COLLECTION GOALS QUICKER: AN EXPERIMENTAL EVALUATION TO REDUCE FIELDWORK DURATION IN A MIXED-MODE PANEL STUDY.

IF 1.6, Zone 4 (Mathematics)

Journal of Survey Statistics and Methodology. Pub Date: 2026-01-08. DOI: 10.1093/jssam/smaf030

Katherine A McGonagle, Narayan Sastry

Abstract: An experiment was implemented in the 2023 wave of a US household panel study to assess the effects of a shortened field period on data collection outcomes. Following the recent adoption of sequential mixed-mode designs by panel studies worldwide, it has been observed that interview completion for respondents offered the initial mode of web is faster compared to those initially offered the telephone. This study describes an experiment designed to evaluate whether the new mixed-mode designs can support an accelerated field period and achieve cost savings while still meeting fieldwork goals. We assessed a shorter field period of 20 weeks against the standard 28-week field period and randomized study participants to each condition. The treatment group received accelerated fieldwork protocols over a 20-week data collection period, and a control group received the same protocols over the standard 28-week period. We compare the effect of the shortened duration on fieldwork outcomes, including response rates, sample composition, interviewer effort, time to interview completion, survey costs, and interview quality. We find that the accelerated protocol yields higher response rates, lower interviewer effort, and cost savings with no differences in sample composition or decrements to interview quality. We describe the strengths and limitations of the study and provide suggestions for future research on fieldwork duration.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12970957/pdf/

Citations: 0

COMPARATIVE EFFECTIVENESS OF PROPENSITY SCORE ESTIMATION METHODS FOR INVERSE PROBABILITY OF TREATMENT WEIGHTING ANALYSIS WITH COMPLEX SURVEY DATA: A SIMULATION STUDY.

IF 1.6, Zone 4 (Mathematics)

Journal of Survey Statistics and Methodology. Pub Date: 2025-04-12. DOI: 10.1093/jssam/smaf003

Lihua Li, Chen Yang, Liangyuan Hu, Wei Zhang, Melissa Aldridge, Bian Liu, Madhu Mazumdar

Abstract: Propensity score (PS) methods, including inverse probability of treatment weighting (IPTW) analysis, are increasingly applied to complex survey data in geriatric studies to infer causal effects. However, the comparative effectiveness of various PS estimation methods, particularly novel machine learning algorithms, has not been thoroughly explored when complex survey data are involved. We conducted a comprehensive simulation study to compare the following six PS estimation methods in IPTW analysis: Logistic Regression, Covariate Balancing Propensity Score, Generalized Boosted Model, Classification and Regression Tree, Random Forest (RF), and Super Learner. We considered 12 scenarios with varying treatment effects, degrees of non-linearity and non-additivity in the associations between covariates and the exposure, and levels of PS overlap. The performance of these six methods was assessed in terms of mean relative bias, root mean square error, and coverage probability. The results showed a similar performance across all methods when PS overlap was strong. However, RF consistently outperformed the other methods when PS overlap was not strong and under non-additive and non-linear scenarios. The results suggest RF to be a more effective approach for PS estimation than the other proposed methods when applying IPTW analysis to complex survey data for population average treatment effects. The methods were applied to data from the Medicare Beneficiary Current Survey for years 2002-2019 to estimate the impact of hospice use on end-of-life healthcare costs. Findings from the real-world example show that hospice use was significantly associated with reduced end-of-life healthcare costs of Medicare Beneficiaries.

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12721855/pdf/

Citations: 0

Synthesizing Surveys with Multiple Units of Observation: An Application to the Longitudinal Aging Study in India.

IF 1.6, Zone 4 (Mathematics)

Journal of Survey Statistics and Methodology. Pub Date: 2025-01-09. eCollection Date: 2025-09-01. DOI: 10.1093/jssam/smae047

Joshua Snoke, Erik Meijer, Drystan Phillips, Jenny Wilkens, Jinkook Lee

Abstract: We present methodology for creating synthetic data and an application to create a publicly releasable synthetic version of the Longitudinal Aging Study in India (LASI). The LASI, a health and retirement survey, is used for research and educational purposes, but it can only be shared under restricted access due to privacy considerations. We present novel methods to synthesize the survey, maintaining three nested levels of observation (individuals, couples, and households) with both continuous and categorical variables and survey weights. We show that the synthetic data maintains the distributional patterns of the confidential data and largely mitigates identification and attribute disclosure risk. We also present a novel method for controlling the risk and utility tradeoff for the synthetic data that takes into account the survey sampling rates. Specifically, we down-weight records that have a high likelihood of being uniquely identifiable in the population due to unique demographic information and oversampling. We show this approach reduces both identification and attribute risk for records while preserving better utility over another common approach of coarsening records. Our methods and evaluations provide a foundation for creating a synthetic version of surveys with multiple units of observation, such as the LASI.

Volume 13, Issue 4, pp. 420-444. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12596149/pdf/

Citations: 0

Real World Data Versus Probability Surveys for Estimating Health Conditions at the State Level.

IF 1.6, Zone 4 (Mathematics)

Journal of Survey Statistics and Methodology. Pub Date: 2024-11-01. DOI: 10.1093/jssam/smae036

David A Marker, Charity Hilton, Jacob Zelko, Jon Duke, Deborah Rolka, Rachel Kaufmann, Richard Boyd

Abstract: Government statistical offices worldwide are under pressure to produce statistics rapidly and for more detailed geographies, to compete with unofficial estimates available from web-based big data sources or from private companies. Commonly suggested sources of improved health information are electronic health records (EHRs) and medical claims data. These data sources are collectively known as real world data (RWD) because they are generated from routine health care processes, and they are available for millions of patients. It is clear that RWD can provide estimates that are more timely and less expensive to produce, but a key question is whether or not they are very accurate. To test this, we took advantage of a unique health data source that includes a full range of sociodemographic variables and compared estimates using all of those potential weighting variables versus estimates derived when only age and sex are available for weighting (as is common with most RWD sources). We show that not accounting for other variables can produce misleading, and quite inaccurate, health estimates.

Volume 12, Issue 5, pp. 1515-1530. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11708384/pdf/

Citations: 0

Analyzing Potential Non-Ignorable Selection Bias in an Off-Wave Mail Survey Implemented in a Long-Standing Panel Study.

IF 1.6, Zone 4 (Mathematics)

Journal of Survey Statistics and Methodology. Pub Date: 2024-10-23. eCollection Date: 2025-02-01. DOI: 10.1093/jssam/smae039

Heather M Schroeder, Brady T West

Abstract: Typical design-based methods for weighting probability samples rely on several assumptions, including the random selection of sampled units according to known probabilities of selection and ignorable unit nonresponse. If any of these assumptions are not met, weighting methods that account for the probabilities of selection, nonresponse, and calibration may not fully account for the potential selection bias in a given sample, which could produce misleading population estimates. This analysis investigates possible selection bias in the 2019 Health Survey Mailer (HSM), a sub-study of the longitudinal Health and Retirement Study (HRS). The primary HRS data collection has occurred in "even" years since 1992, but additional survey data collections take place in the "off-wave" odd years via mailed invitations sent to selected participants. While the HSM achieved a high response rate (83 percent), the assumption of ignorable probability-based selection of HRS panel members may not hold due to the eligibility criteria that were imposed. To investigate this possible non-ignorable selection bias, our analysis utilizes a novel analysis method for estimating measures of unadjusted bias for proportions (MUBP), introduced by Andridge et al. in 2019. This method incorporates aggregate information from the larger HRS target population, including means, variances, and covariances for key covariates related to the HSM variables, to inform estimates of proportions. We explore potential non-ignorable selection bias by comparing proportions calculated from the HSM under three conditions: ignoring HRS weights, weighting based on the usual design-based approach for HRS "off-wave" mail surveys, and using the MUBP adjustment. We find examples of differences between the weighted and MUBP-adjusted estimates in four out of ten outcomes we analyzed. However, these differences are modest, and while this result gives some evidence of non-ignorable selection bias, typical design-based weighting methods are sufficient for correcting for it and their use is appropriate in this case.

Volume 13, Issue 1, pp. 100-127. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11770253/pdf/

Citations: 0

Small Area Poverty Estimation under Heteroskedasticity

IF 2.1, Zone 4 (Mathematics)

Journal of Survey Statistics and Methodology. Pub Date: 2024-01-10. DOI: 10.1093/jssam/smad045

Sumonkanti Das, Ray Chambers

Abstract: Multilevel models with nested errors are widely used in poverty estimation. An important application in this context is estimating the distribution of poverty as defined by the distribution of income within a set of domains that cover the population of interest. Since unit-level values of income are usually heteroskedastic, the standard homoskedasticity assumptions implicit in popular multilevel models may not be appropriate and can lead to bias, particularly when used to estimate domain-specific income distributions. This article addresses this problem when the income values in the population of interest can be characterized by a two-level mixed linear model with independent and identically distributed domain effects and with independent but not identically distributed individual effects. Estimation of poverty indicators that are functionals of domain-level income distributions is also addressed, and a nonparametric bootstrap procedure is used to estimate mean squared errors and confidence intervals. The proposed methodology is compared with the well-known World Bank poverty mapping methodology for this situation, using model-based simulation experiments as well as an empirical study based on Bangladesh poverty data.

Citations: 0

Investigating Respondent Attention to Experimental Text Lengths

IF 2.1, Zone 4 (Mathematics)

Journal of Survey Statistics and Methodology. Pub Date: 2024-01-04. DOI: 10.1093/jssam/smad044

Tobias Rettig, A. Blom

Citations: 0

A Catch-22—the Test–Retest Method of Reliability Estimation

IF 2.1, Zone 4 (Mathematics)

Journal of Survey Statistics and Methodology. Pub Date: 2023-12-20. DOI: 10.1093/jssam/smad043

Paula A. Tufiș, D. Alwin, Daniel N Ramírez

Abstract: This article addresses the problems with the traditional reinterview approach to estimating the reliability of survey measures. Using data from three reinterview (or panel) studies conducted by the General Social Survey, we investigate the differences between the two-wave correlational approach embodied by the traditional reinterview strategy, compared to estimates of reliability that take the stability of traits into account based on a three-wave model. Our results indicate that the problems identified with the two-wave correlational approach reflect a kind of “Catch-22” in the sense that the only solution to the problem is denied by the approach itself. Specifically, we show that the correctly specified two-wave model, which includes the potential for true change in the latent variable, is underidentified, and thus, unless one is willing to make some potentially risky assumptions, reliability parameters are not estimable. This article compares the two-wave correlational approach to an alternative model for estimating reliability, Heise’s estimates based on the three-wave simplex model. Using three waves of data from the GSS panels, which were separated by 2-year intervals between waves, this article examines the conditions under which the wave-1, wave-2 correlations, which do not take stability into account, approximate the reliability estimate obtained from three-wave simplex models that do take stability into account. The results lead to the conclusion that the differences between estimates depend on the stability and/or fixed nature of the underlying processes involved. Few if any differences are identified when traits are fixed or highly stable, but for traits involving changes in the underlying traits the differences can be quite large, and thus, we argue for the superiority of reinterview designs that involve more than 2 waves in the estimation of reliability parameters.

Citations: 0