{"title":"A novel martingale difference correlation via data splitting with applications in feature screening","authors":"Zhengyu Zhu , Jicai Liu , Riquan Zhang","doi":"10.1016/j.jmva.2025.105508","DOIUrl":null,"url":null,"abstract":"<div><div>In this paper, we introduce a novel sample martingale difference correlation via data splitting to measure the departure of conditional mean independence between a response variable <span><math><mi>Y</mi></math></span> and a vector predictor <span><math><mi>X</mi></math></span>. The proposed correlation converges to zero and has an asymptotically symmetric sampling distribution around zero when <span><math><mi>Y</mi></math></span> and <span><math><mi>X</mi></math></span> are conditionally mean independent. In contrast, it converges to a positive value when <span><math><mi>Y</mi></math></span> and <span><math><mi>X</mi></math></span> are conditionally mean dependent. Leveraging these properties, we develop a new model-free feature screening method with false discovery rate (FDR) control for ultrahigh-dimensional data. We demonstrate that this screening method achieves FDR control and the sure screening property simultaneously. We also extend our approach to conditional quantile screening with FDR control. To further enhance the stability of the screening results, we implement multiple splitting techniques. We evaluate the finite sample performance of our proposed methods through simulations and real data analyses, and compare them with existing methods.</div></div>","PeriodicalId":16431,"journal":{"name":"Journal of Multivariate Analysis","volume":"211 ","pages":"Article 105508"},"PeriodicalIF":1.4000,"publicationDate":"2025-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Multivariate Analysis","FirstCategoryId":"100","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0047259X25001034","RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"STATISTICS & PROBABILITY","Score":null,"Total":0}
引用次数: 0
Abstract
In this paper, we introduce a novel sample martingale difference correlation via data splitting to measure the departure of conditional mean independence between a response variable and a vector predictor . The proposed correlation converges to zero and has an asymptotically symmetric sampling distribution around zero when and are conditionally mean independent. In contrast, it converges to a positive value when and are conditionally mean dependent. Leveraging these properties, we develop a new model-free feature screening method with false discovery rate (FDR) control for ultrahigh-dimensional data. We demonstrate that this screening method achieves FDR control and the sure screening property simultaneously. We also extend our approach to conditional quantile screening with FDR control. To further enhance the stability of the screening results, we implement multiple splitting techniques. We evaluate the finite sample performance of our proposed methods through simulations and real data analyses, and compare them with existing methods.
期刊介绍:
Founded in 1971, the Journal of Multivariate Analysis (JMVA) is the central venue for the publication of new, relevant methodology and particularly innovative applications pertaining to the analysis and interpretation of multidimensional data.
The journal welcomes contributions to all aspects of multivariate data analysis and modeling, including cluster analysis, discriminant analysis, factor analysis, and multidimensional continuous or discrete distribution theory. Topics of current interest include, but are not limited to, inferential aspects of
Copula modeling
Functional data analysis
Graphical modeling
High-dimensional data analysis
Image analysis
Multivariate extreme-value theory
Sparse modeling
Spatial statistics.