Jianqing Fan (Princeton University), Weining Wang (University of Groningen), Yue Zhao (University of York)
{"title":"通过神经因子回归进行条件非参数变量筛选","authors":"Jianqing FanPrinceton University, Weining WangUniversity of Groningen, Yue ZhaoUniversity of York","doi":"arxiv-2408.10825","DOIUrl":null,"url":null,"abstract":"High-dimensional covariates often admit linear factor structure. To\neffectively screen correlated covariates in high-dimension, we propose a\nconditional variable screening test based on non-parametric regression using\nneural networks due to their representation power. We ask the question whether\nindividual covariates have additional contributions given the latent factors or\nmore generally a set of variables. Our test statistics are based on the\nestimated partial derivative of the regression function of the candidate\nvariable for screening and a observable proxy for the latent factors. Hence,\nour test reveals how much predictors contribute additionally to the\nnon-parametric regression after accounting for the latent factors. Our\nderivative estimator is the convolution of a deep neural network regression\nestimator and a smoothing kernel. We demonstrate that when the neural network\nsize diverges with the sample size, unlike estimating the regression function\nitself, it is necessary to smooth the partial derivative of the neural network\nestimator to recover the desired convergence rate for the derivative. Moreover,\nour screening test achieves asymptotic normality under the null after finely\ncentering our test statistics that makes the biases negligible, as well as\nconsistency for local alternatives under mild conditions. We demonstrate the\nperformance of our test in a simulation study and two real world applications.","PeriodicalId":501293,"journal":{"name":"arXiv - ECON - Econometrics","volume":"1587 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Conditional nonparametric variable screening by neural factor regression\",\"authors\":\"Jianqing FanPrinceton University, Weining WangUniversity of Groningen, Yue ZhaoUniversity of York\",\"doi\":\"arxiv-2408.10825\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"High-dimensional covariates often admit linear factor structure. To\\neffectively screen correlated covariates in high-dimension, we propose a\\nconditional variable screening test based on non-parametric regression using\\nneural networks due to their representation power. We ask the question whether\\nindividual covariates have additional contributions given the latent factors or\\nmore generally a set of variables. Our test statistics are based on the\\nestimated partial derivative of the regression function of the candidate\\nvariable for screening and a observable proxy for the latent factors. Hence,\\nour test reveals how much predictors contribute additionally to the\\nnon-parametric regression after accounting for the latent factors. Our\\nderivative estimator is the convolution of a deep neural network regression\\nestimator and a smoothing kernel. We demonstrate that when the neural network\\nsize diverges with the sample size, unlike estimating the regression function\\nitself, it is necessary to smooth the partial derivative of the neural network\\nestimator to recover the desired convergence rate for the derivative. 
Moreover,\\nour screening test achieves asymptotic normality under the null after finely\\ncentering our test statistics that makes the biases negligible, as well as\\nconsistency for local alternatives under mild conditions. We demonstrate the\\nperformance of our test in a simulation study and two real world applications.\",\"PeriodicalId\":501293,\"journal\":{\"name\":\"arXiv - ECON - Econometrics\",\"volume\":\"1587 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-08-20\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - ECON - Econometrics\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2408.10825\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - ECON - Econometrics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2408.10825","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Conditional nonparametric variable screening by neural factor regression
High-dimensional covariates often admit a linear factor structure. To effectively screen correlated covariates in high dimensions, we propose a conditional variable screening test based on nonparametric regression with neural networks, chosen for their representation power. We ask whether individual covariates contribute additional predictive power given the latent factors, or more generally given a conditioning set of variables. Our test statistic is based on the estimated partial derivative of the regression function with respect to the candidate variable, where the regression is on the candidate variable and an observable proxy for the latent factors. The test therefore reveals how much a predictor contributes to the nonparametric regression beyond what the latent factors already explain. Our derivative estimator is the convolution of a deep neural network regression estimator with a smoothing kernel. We show that when the size of the neural network diverges with the sample size, it is necessary, unlike when estimating the regression function itself, to smooth the partial derivative of the neural network estimator in order to recover the desired convergence rate for the derivative. Moreover, after a fine centering of the test statistic that renders the bias negligible, our screening test is asymptotically normal under the null and consistent against local alternatives under mild conditions. We demonstrate the performance of our test in a simulation study and in two real-world applications.
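The abstract does not spell out how the kernel-smoothed derivative is computed, so the following is only a minimal sketch of the general idea: fit a neural network regression, then estimate the partial derivative with respect to one coordinate by integrating the fitted network against the derivative of a Gaussian smoothing kernel (equivalent to differentiating the kernel-smoothed estimator). The use of scikit-learn's MLPRegressor as a stand-in for the deep neural network, the simulated data-generating process, the bandwidth `h`, and the integration grid are all illustrative assumptions, not details from the paper.

```python
# Minimal sketch: kernel-smoothed partial derivative of a fitted neural network
# regression estimator. Not the authors' implementation; all tuning choices
# (network size, bandwidth h, integration grid) are placeholders.

import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

# Simulated data: y = sin(2*x1) + 0.5*x2 + noise, so the true dy/dx1 = 2*cos(2*x1).
n = 2000
X = rng.uniform(-1, 1, size=(n, 2))
y = np.sin(2 * X[:, 0]) + 0.5 * X[:, 1] + 0.1 * rng.standard_normal(n)

# Stand-in for the deep neural network regression estimator.
nn = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0)
nn.fit(X, y)

def smoothed_partial_x1(x, h=0.1, m=201):
    """Estimate the partial derivative of the fitted regression at point x
    with respect to the first coordinate, by convolving the fitted network
    with a Gaussian kernel along x1 and differentiating the smoothed function
    (equivalently, integrating the network against the kernel's derivative)."""
    u = np.linspace(-4 * h, 4 * h, m)          # integration grid along x1
    pts = np.tile(x, (m, 1))
    pts[:, 0] = x[0] + u                       # perturb the x1 coordinate only
    preds = nn.predict(pts)
    # Derivative weight (u / h^3) * phi(u / h), phi the standard normal density;
    # integrating a linear function against it returns its slope exactly.
    phi = np.exp(-0.5 * (u / h) ** 2) / np.sqrt(2 * np.pi)
    kern_deriv = (u / h**3) * phi
    return np.trapz(preds * kern_deriv, u)

x0 = np.array([0.3, 0.0])
print("smoothed derivative estimate:", smoothed_partial_x1(x0))
print("true derivative at x0       :", 2 * np.cos(2 * 0.3))
```

The design point this illustrates is the one highlighted in the abstract: the raw derivative of a large neural network fit can be too rough, so the derivative is taken of the kernel-smoothed fit rather than of the network itself.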