Two-stage communication-efficient distributed sparse M-estimation with missing data

IF 1.2 Zone 4 Mathematics Q2 STATISTICS & PROBABILITY

Statistics Pub Date : 2023-04-12 DOI:10.1080/02331888.2023.2201505

Xudong Zhang, Ting Zhang, Lei Wang

Statistics, pages 617-636. Journal Article.

Citations: 0

Abstract


Two-stage communication-efficient distributed sparse M-estimation with missing data

Distributed estimation based on different sources of observations has drawn attention in modern statistical learning. When the distributed data are missing at random, we propose a two-stage ℓ1-penalized communication-efficient surrogate likelihood (CSL) algorithm based on inverse probability weighting (IPW) to eliminate the estimation bias caused by the missing data and simultaneously construct a sparse distributed M-estimator. In the first stage, we consider a parametric propensity model and directly apply the ℓ1-penalized CSL method to obtain an efficient and sparse distributed estimator of the propensity parameter. In the second stage, we construct an IPW-based ℓ1-penalized CSL loss function to eliminate the bias and obtain the sparse M-estimator. The finite-sample performance of the estimators is studied through simulation, and an application to a house sale prices data set is also presented.
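The two-stage procedure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes an ℓ1 penalty, a logistic propensity model, a squared-error loss for the M-estimator, a response that is missing at random given the covariates, and a plain proximal-gradient solver. All function names, tuning values (`lam1`, `lam2`, step sizes, number of rounds) and the simulated data are hypothetical choices made for illustration only.

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of t * ||.||_1
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def prox_grad(grad, theta0, lam, step, n_iter=500):
    # Proximal gradient descent for: smooth loss + lam * ||theta||_1
    theta = theta0.copy()
    for _ in range(n_iter):
        theta = soft_threshold(theta - step * grad(theta), step * lam)
    return theta

def csl_l1(local_grads, p, lam, step, rounds=3):
    # Machine 0 minimizes the surrogate loss
    #   L_0(t) - <grad L_0(tb) - grad L_N(tb), t> + lam * ||t||_1,
    # where grad L_N(tb) is the average of all local gradients at tb.
    g0 = local_grads[0]
    theta = prox_grad(g0, np.zeros(p), lam, step)  # local initial estimate
    for _ in range(rounds):
        # One communication round: every machine sends its gradient at theta
        g_all = np.mean([g(theta) for g in local_grads], axis=0)
        shift = g0(theta) - g_all
        theta = prox_grad(lambda t: g0(t) - shift, theta, lam, step)
    return theta

def logistic_grad(X, d):
    # Gradient of the average negative log-likelihood of the propensity model
    return lambda g: X.T @ (1.0 / (1.0 + np.exp(-X @ g)) - d) / len(d)

def ipw_ls_grad(X, y, d, gamma):
    # Gradient of the IPW-weighted squared-error loss; unobserved y gets weight 0
    pi = np.clip(1.0 / (1.0 + np.exp(-X @ gamma)), 0.05, 1.0)
    w = d / pi
    y0 = np.where(d == 1, y, 0.0)
    return lambda b: -X.T @ (w * (y0 - X @ b)) / len(d)

def simulate(K=5, n=400, p=10, seed=0):
    # Hypothetical data: K machines, sparse beta, response missing at random
    rng = np.random.default_rng(seed)
    beta = np.zeros(p)
    beta[:3] = [2.0, -1.5, 1.0]
    gamma = np.zeros(p)
    gamma[:2] = [1.0, 0.5]
    machines = []
    for _ in range(K):
        X = np.column_stack([np.ones(n), rng.standard_normal((n, p - 1))])
        y = X @ beta + rng.standard_normal(n)
        d = (rng.random(n) < 1.0 / (1.0 + np.exp(-X @ gamma))).astype(float)
        y = np.where(d == 1, y, np.nan)  # hide the unobserved responses
        machines.append((X, y, d))
    return machines, beta, gamma

def two_stage(machines, p, lam1=0.02, lam2=0.05):
    # Stage 1: sparse distributed estimate of the propensity parameter
    gamma_hat = csl_l1([logistic_grad(X, d) for X, _, d in machines],
                       p, lam1, step=0.5)
    # Stage 2: sparse distributed M-estimate from the IPW-weighted loss
    beta_hat = csl_l1([ipw_ls_grad(X, y, d, gamma_hat) for X, y, d in machines],
                      p, lam2, step=0.1)
    return gamma_hat, beta_hat

machines, beta, gamma = simulate()
gamma_hat, beta_hat = two_stage(machines, p=10)
print(np.round(beta_hat, 2))
```

In this sketch only gradient vectors and the current parameter estimate travel between machines in each round, which is what makes the surrogate-likelihood approach communication-efficient. The fixed step sizes are chosen by hand for the simulated design and would need tuning for other data.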


Source journal

Statistics (Mathematics: Statistics & Probability)

CiteScore

1.00

Self-citation rate

0.00%

Articles published

Review time

12 months

About the journal: Statistics publishes papers developing and analysing new methods for any active field of statistics, motivated by real-life problems. Papers submitted for consideration should provide interesting and novel contributions to statistical theory and its applications with rigorous mathematical results and proofs. Moreover, numerical simulations and applications to real data sets can improve the quality of papers, and should be included where appropriate. Statistics does not publish papers which represent mere application of existing procedures to case studies, and papers are required to contain methodological or theoretical innovation. Topics of interest include, for example, nonparametric statistics, time series, and the analysis of topological or functional data. Furthermore, the journal also welcomes submissions in the field of theoretical econometrics and its links to mathematical statistics.