Two-stage communication-efficient distributed sparse M-estimation with missing data
Authors: Xudong Zhang, Ting Zhang, Lei Wang
Journal: Statistics, pp. 617-636 (published 2023-04-12)
DOI: https://doi.org/10.1080/02331888.2023.2201505
Distributed estimation based on different sources of observations has drawn attention in modern statistical learning. When the distributed data are missing at random, we propose a two-stage $\ell_1$-penalized communication-efficient surrogate likelihood (CSL) algorithm based on inverse probability weighting (IPW) to eliminate the estimation bias caused by the missing data and simultaneously construct a sparse distributed M-estimator. In the first stage, we consider a parametric propensity model and directly apply the $\ell_1$-penalized CSL method to obtain an efficient and sparse distributed estimator of the propensity parameter. In the second stage, we construct an IPW-based $\ell_1$-penalized CSL loss function to eliminate the bias and obtain the sparse M-estimator. The finite-sample performance of the estimators is studied through simulation, and an application to a house sale price data set is also presented.
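The two stages described above can be illustrated with a minimal single-machine sketch. It is not the authors' distributed CSL algorithm: it assumes a squared-error loss, a logistic propensity model fitted without a penalty, and a plain proximal-gradient solver, and every name in it (`fit_logistic`, `ipw_lasso`, the simulated data, the value of `lam`) is hypothetical. It only shows how weighting each observed response by the inverse of its estimated observation probability enters an $\ell_1$-penalized loss.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def soft_threshold(v, t):
    # Proximal operator of t * ||.||_1
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def fit_logistic(Z, r, n_iter=25):
    # Stage 1 (simplified): unpenalized logistic propensity model, Newton's method
    gamma = np.zeros(Z.shape[1])
    for _ in range(n_iter):
        p = sigmoid(Z @ gamma)
        hessian = (Z * (p * (1.0 - p))[:, None]).T @ Z
        gamma = gamma + np.linalg.solve(hessian, Z.T @ (r - p))
    return gamma

def ipw_lasso(X, y, observed, propensity, lam, n_iter=500):
    # Stage 2 (simplified): minimize
    #   (1 / 2n) * sum_i w_i * (y_i - x_i' beta)^2 + lam * ||beta||_1,
    # with IPW weights w_i = observed_i / propensity_i, by proximal gradient.
    n, p = X.shape
    w = observed / propensity
    y_filled = np.where(observed == 1, y, 0.0)  # missing responses get weight 0
    Xw = X * w[:, None]
    lipschitz = np.linalg.eigvalsh(Xw.T @ X / n).max()
    beta = np.zeros(p)
    for _ in range(n_iter):
        grad = -Xw.T @ (y_filled - X @ beta) / n
        beta = soft_threshold(beta - grad / lipschitz, lam / lipschitz)
    return beta

# Simulated data (assumption): sparse linear model, response missing at random
rng = np.random.default_rng(0)
n, p = 2000, 10
X = rng.normal(size=(n, p))
beta_true = np.zeros(p)
beta_true[:2] = [2.0, -1.5]
y = X @ beta_true + rng.normal(size=n)

# Missingness depends only on an always-observed covariate (MAR)
Z = np.column_stack([np.ones(n), X[:, 0]])
gamma_true = np.array([1.0, 1.0])
observed = rng.binomial(1, sigmoid(Z @ gamma_true))

gamma_hat = fit_logistic(Z, observed)
beta_hat = ipw_lasso(X, y, observed, sigmoid(Z @ gamma_hat), lam=0.1)
```

In a distributed setting, the second-stage objective would be replaced by a surrogate loss built from one machine's local data plus gradients aggregated from the other machines; that step is omitted here.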
About the journal:
Statistics publishes papers developing and analysing new methods for any active field of statistics, motivated by real-life problems. Papers submitted for consideration should provide interesting and novel contributions to statistical theory and its applications with rigorous mathematical results and proofs. Moreover, numerical simulations and applications to real data sets can improve the quality of papers, and should be included where appropriate. Statistics does not publish papers which represent mere application of existing procedures to case studies, and papers are required to contain methodological or theoretical innovation. Topics of interest include, for example, nonparametric statistics, time series, and the analysis of topological or functional data. Furthermore, the journal also welcomes submissions in the field of theoretical econometrics and its links to mathematical statistics.