Phuoc-Loc Tran, Shen-Ming Lee, Truong-Nhat Le, Chin-Shang Li
Annals of the Institute of Statistical Mathematics, 77(2), 251–287. Published 2024-12-02. DOI: 10.1007/s10463-024-00914-9. https://link.springer.com/article/10.1007/s10463-024-00914-9
Citations: 0
Abstract
Large-sample properties of multiple imputation estimators for parameters of logistic regression with covariates missing at random separately or simultaneously
We examine the asymptotic properties of two multiple imputation (MI) estimators, introduced by Lee et al. (Computational Statistics, 38, 899–934, 2023), for the parameters of logistic regression with two sets of discrete or categorical covariates that are missing at random separately or simultaneously. The proposed estimated asymptotic variances of the two MI estimators address a limitation of Rubin's estimated variances, which underestimate the variances of the two MI estimators (Rubin, 1987, Statistical Analysis with Missing Data, New York: Wiley). Simulation results demonstrate that our two proposed MI methods outperform the complete-case, semiparametric inverse probability weighting, random forest MI using chained equations, and stochastic approximation of expectation-maximization methods. To illustrate the methodology's practical application, we provide a real data example from a survey conducted at the Feng Chia night market in Taichung City, Taiwan.
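For readers unfamiliar with the baseline the abstract refers to, the following is a minimal sketch of the standard Rubin combining rules for M imputed-data estimates of a single coefficient. It is not the variance estimator proposed in the paper, whose form is not given in the abstract. The function name `rubin_pool` and the numbers in the usage lines are hypothetical and chosen only for illustration.

```python
from statistics import mean, variance


def rubin_pool(estimates, variances):
    """Pool M point estimates and their variances with Rubin's rules.

    estimates: one point estimate per imputed data set
    variances: the matching within-imputation variance estimates
    Returns (pooled estimate, Rubin's total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need M >= 2 estimates with matching variances")
    pooled = mean(estimates)        # average of the M point estimates
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # sample variance, denominator M - 1
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical values for one logistic regression coefficient, M = 3.
est = [0.50, 0.55, 0.45]
var = [0.010, 0.012, 0.011]
pooled, total_var = rubin_pool(est, var)
print(pooled, total_var)
```

The total variance is the within-imputation average plus an inflated between-imputation term. The abstract's claim is that, for the two MI estimators studied, this quantity underestimates the true variance, which motivates the authors' alternative asymptotic variance estimates.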
About the journal:
Annals of the Institute of Statistical Mathematics (AISM) aims to provide a forum for open communication among statisticians, and to contribute to the advancement of statistics as a science to enable humans to handle information in order to cope with uncertainties. It publishes high-quality papers that shed new light on the theoretical, computational and/or methodological aspects of statistical science. Emphasis is placed on (a) development of new methodologies motivated by real data, (b) development of unifying theories, and (c) analysis and improvement of existing methodologies and theories.