Semiparametric Recovery of Central Dimension Reduction Space with Nonignorable Nonresponse
Authors: Siming Zheng, Alan T.K. Wan, Yong Zhou
Journal: Statistica Neerlandica
Publication date: 2023-09-06
DOI: 10.1111/stan.12321 (https://doi.org/10.1111/stan.12321)
Abstract
Sufficient dimension reduction (SDR) methods are effective tools for handling high-dimensional data. Classical SDR methods are developed under the assumption that the data are completely observed. For incomplete data, SDR has been considered only when the values are randomly missing, not when they are non-ignorably missing. The non-ignorable case is arguably more difficult to handle because the missingness depends on the missing values themselves. The purpose of this paper is to fill this void. We propose an intuitive, easy-to-implement SDR estimator based on a semiparametric propensity score function for response data with non-ignorable missing values. We refer to it as the dimension reduction-based imputed estimator. We establish the theoretical properties of this estimator and examine its empirical performance via an extensive numerical study on real and simulated data. We also compare the performance of the proposed estimator with two competing estimators, the fusion refined estimator and the cumulative slicing estimator. A distinguishing feature of our method is that it requires no validation sample. The SDR theory developed in this paper is a non-trivial extension of the existing literature, due to the technical challenges posed by non-ignorable missingness. All technical proofs of the theorems are given in the Online Supplementary Material.
About the journal:
Statistica Neerlandica has been the journal of the Netherlands Society for Statistics and Operations Research since 1946. It covers all areas of statistics, from theoretical to applied, with a special emphasis on mathematical statistics, statistics for the behavioural sciences and biostatistics. This wide scope is reflected by the expertise of the journal’s editors representing these areas. The diverse editorial board is committed to a fast and fair reviewing process, and will judge submissions on quality, correctness, relevance and originality. Statistica Neerlandica encourages transparency and reproducibility, and offers online resources to make data, code, simulation results and other additional materials publicly available.