Distribution-free and model-free multivariate feature screening via multivariate rank distance correlation
Shaofei Zhao, Guifang Fu
Journal of Multivariate Analysis, Volume 192, Article 105081, published 2022-11-01
DOI: 10.1016/j.jmva.2022.105081
URL: https://www.sciencedirect.com/science/article/pii/S0047259X22000811
Impact factor: 1.4; JCR: Q2 (Statistics & Probability)
Citations: 3
Abstract
Feature screening approaches are effective in selecting active features from data with ultrahigh dimensionality and increasing complexity; however, many existing approaches are either restricted to a univariate response or rely on distributional or model assumptions. In this article, we propose a sure independence screening approach based on the multivariate rank distance correlation (MrDc-SIS). MrDc-SIS has several desirable properties: it is distribution-free, completely nonparametric, scale-free, and robust to outliers and heavy tails. Moreover, it can be used to screen either univariate or multivariate responses and either one-dimensional or multi-dimensional predictors. We establish the sure screening and rank consistency properties of MrDc-SIS under a mild condition, lifting previous assumptions about finite moments. Simulation studies demonstrate that MrDc-SIS outperforms eight other closely related approaches under certain settings. We also apply MrDc-SIS to a multi-omics ovarian carcinoma dataset downloaded from The Cancer Genome Atlas (TCGA).
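As a rough illustration of the screening idea summarized in the abstract, the sketch below scores each predictor by a rank-based distance correlation with the (possibly multivariate) response and keeps the `d` highest-scoring predictors. It is not the authors' MrDc-SIS: the abstract does not define the multivariate rank, so componentwise ranks are substituted here as a simple stand-in. The function names (`componentwise_ranks`, `distance_correlation`, `rank_dc_screen`) and the cutoff `d` are assumptions made for this example only.

```python
import numpy as np


def _centered_distances(a):
    """Double-centered pairwise Euclidean distance matrix of the rows of a."""
    a = np.asarray(a, dtype=float)
    if a.ndim == 1:
        a = a[:, None]
    d = np.linalg.norm(a[:, None, :] - a[None, :, :], axis=-1)
    row_mean = d.mean(axis=1, keepdims=True)
    col_mean = d.mean(axis=0, keepdims=True)
    return d - row_mean - col_mean + d.mean()


def distance_correlation(x, y):
    """Sample distance correlation (V-statistic form) between x and y."""
    A = _centered_distances(x)
    B = _centered_distances(y)
    dcov2 = (A * B).mean()
    denom = np.sqrt((A * A).mean() * (B * B).mean())
    if denom == 0:
        return 0.0
    return float(np.sqrt(max(dcov2, 0.0) / denom))


def componentwise_ranks(a):
    """Scaled ranks in (0, 1], computed separately for each column.

    Assumption: a stand-in for the paper's multivariate rank, which the
    abstract does not define. Ties are broken by position.
    """
    a = np.asarray(a, dtype=float)
    if a.ndim == 1:
        a = a[:, None]
    n = a.shape[0]
    ranks = np.argsort(np.argsort(a, axis=0), axis=0) + 1
    return ranks / n


def rank_dc_screen(X, Y, d):
    """Return (indices of the d top-scoring predictors, all scores).

    X has shape (n, p); Y has shape (n,) or (n, q). Each column of X is
    scored by the distance correlation between its ranks and the ranks of Y.
    """
    X = np.asarray(X, dtype=float)
    ranked_Y = componentwise_ranks(Y)
    scores = np.array([
        distance_correlation(componentwise_ranks(X[:, j]), ranked_Y)
        for j in range(X.shape[1])
    ])
    top = np.argsort(-scores)[:d]
    return top, scores
```

Because the scores depend on the data only through ranks, they are unchanged when a predictor is rescaled or passed through any strictly increasing transformation, which is one way to read the "scale-free" and outlier-robust properties claimed above. Calling `rank_dc_screen(X, Y, d)` with an `(n, p)` predictor matrix and an `(n, q)` response returns the retained column indices and the full score vector.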
About the journal:
Founded in 1971, the Journal of Multivariate Analysis (JMVA) is the central venue for the publication of new, relevant methodology and particularly innovative applications pertaining to the analysis and interpretation of multidimensional data.
The journal welcomes contributions to all aspects of multivariate data analysis and modeling, including cluster analysis, discriminant analysis, factor analysis, and multidimensional continuous or discrete distribution theory. Topics of current interest include, but are not limited to, inferential aspects of
Copula modeling
Functional data analysis
Graphical modeling
High-dimensional data analysis
Image analysis
Multivariate extreme-value theory
Sparse modeling
Spatial statistics.