Conditional nonparametric variable screening by neural factor regression

arXiv - ECON - Econometrics | Pub Date: 2024-08-20 | DOI: arxiv-2408.10825

Jianqing Fan (Princeton University), Weining Wang (University of Groningen), Yue Zhao (University of York)



Abstract

High-dimensional covariates often admit a linear factor structure. To screen correlated covariates effectively in high dimensions, we propose a conditional variable screening test based on non-parametric regression using neural networks, chosen for their representation power. We ask whether individual covariates make additional contributions given the latent factors or, more generally, a set of variables. Our test statistics are based on the estimated partial derivative of the regression function of the candidate variable for screening and an observable proxy for the latent factors. Hence, our test reveals how much predictors contribute additionally to the non-parametric regression after accounting for the latent factors. Our derivative estimator is the convolution of a deep neural network regression estimator and a smoothing kernel. We demonstrate that when the neural network size diverges with the sample size, it is necessary to smooth the partial derivative of the neural network estimator to recover the desired convergence rate for the derivative; no such smoothing is needed when estimating the regression function itself. Moreover, after a fine centering of the test statistics that makes the biases negligible, our screening test achieves asymptotic normality under the null, as well as consistency for local alternatives under mild conditions. We demonstrate the performance of our test in a simulation study and two real-world applications.
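The abstract does not spell out the kernel or the exact construction, so the following is only a minimal sketch of the general idea of a kernel-smoothed partial derivative, under assumptions that are not from the paper: a Gaussian kernel, smoothing in a single coordinate, and a hand-written stand-in `f_hat` in place of a trained network. The names `f_hat`, `smoothed_partial_derivative`, and `raw_partial_derivative` are hypothetical. The stand-in adds a tiny high-frequency wiggle to a smooth function to mimic a large fitted network whose values are accurate but whose raw derivative is not.

```python
import numpy as np


def f_hat(pts):
    """Hypothetical stand-in for a fitted regression estimator.

    pts has shape (n, 2). The smooth part is sin(x0) + 0.5 * x1; the last
    term is a small, fast wiggle that barely changes the function values
    but changes the raw derivative in x0 by up to 2.
    """
    x0, x1 = pts[:, 0], pts[:, 1]
    return np.sin(x0) + 0.5 * x1 + 0.01 * np.sin(200.0 * x0)


def raw_partial_derivative(f, x, j, eps=1e-6):
    """Central finite difference of f at x in coordinate j (no smoothing)."""
    x = np.asarray(x, dtype=float)
    e = np.zeros_like(x)
    e[j] = eps
    return (f((x + e)[None, :])[0] - f((x - e)[None, :])[0]) / (2.0 * eps)


def smoothed_partial_derivative(f, x, j, h, n_grid=4001, u_max=8.0):
    """Partial derivative in coordinate j of f convolved with a Gaussian
    kernel of bandwidth h (assumed kernel, applied in coordinate j only).

    Integration by parts moves the derivative onto the kernel:
        d/dx_j (f * K_h)(x) = (1/h) * integral f(x + h*u*e_j) * u * phi(u) du
    where phi is the standard normal density. The integral is approximated
    by a Riemann sum on a uniform grid over [-u_max, u_max].
    """
    u = np.linspace(-u_max, u_max, n_grid)
    du = u[1] - u[0]
    phi = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)
    # One evaluation point per grid node, shifted only in coordinate j.
    pts = np.tile(np.asarray(x, dtype=float), (n_grid, 1))
    pts[:, j] += h * u
    return np.sum(f(pts) * u * phi) * du / h


x = np.array([0.3, 1.0])
target = np.cos(x[0])  # derivative in x0 of the smooth part sin(x0) + 0.5 * x1
raw = raw_partial_derivative(f_hat, x, j=0)
smooth = smoothed_partial_derivative(f_hat, x, j=0, h=0.05)
print("target:", target, "raw:", raw, "smoothed:", smooth)
```

In this toy setting the raw derivative is dominated by the wiggle, while the smoothed derivative stays close to the derivative of the smooth part. The bandwidth `h` trades the damping of the wiggle against the smoothing bias; the value 0.05 is an arbitrary choice for illustration, not a recommendation from the paper.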


