L1-Penalized Quantile Regression in High Dimensional Sparse Models

V. Chernozhukov, A. Belloni
{"title":"L1-Penalized Quantile Regression in High Dimensional Sparse Models","authors":"V. Chernozhukov, A. Belloni","doi":"10.2139/ssrn.1394734","DOIUrl":null,"url":null,"abstract":"We consider median regression and, more generally, quantile regression in high-dimensional sparse models. In these models the overall number of regressors p is very large, possibly larger than the sample size n, but only s of these regressors have non-zero impact on the conditional quantile of the response variable, where s grows slower than n. Since in this case the ordinary quantile regression is not consistent, we consider quantile regression penalized by the L1-norm of coefficients (L1-QR). First, we show that L1-QR is consistent at the rate of the square root of (s/n) log p, which is close to the oracle rate of the square root of (s/n), achievable when the minimal true model is known. The overall number of regressors p affects the rate only through the log p factor, thus allowing nearly exponential growth in the number of zero-impact regressors. The rate result holds under relatively weak conditions, requiring that s/n converges to zero at a super-logarithmic speed and that regularization parameter satisfies certain theoretical constraints. Second, we propose a pivotal, data-driven choice of the regularization parameter and show that it satisfies these theoretical constraints. Third, we show that L1-QR correctly selects the true minimal model as a valid submodel, when the non-zero coefficients of the true model are well separated from zero. We also show that the number of non-zero coefficients in L1-QR is of same stochastic order as s, the number of non-zero coefficients in the minimal true model. Fourth, we analyze the rate of convergence of a two-step estimator that applies ordinary quantile regression to the selected model. Fifth, we evaluate the performance of L1-QR in a Monte-Carlo experiment, and provide an application to the analysis of the international economic growth.","PeriodicalId":219959,"journal":{"name":"ERN: Other Econometrics: Single Equation Models (Topic)","volume":"56 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2009-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"460","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ERN: Other Econometrics: Single Equation Models (Topic)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.2139/ssrn.1394734","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 460

Abstract

We consider median regression and, more generally, quantile regression in high-dimensional sparse models. In these models the overall number of regressors p is very large, possibly larger than the sample size n, but only s of these regressors have a non-zero impact on the conditional quantile of the response variable, where s grows more slowly than n. Since ordinary quantile regression is not consistent in this case, we consider quantile regression penalized by the L1-norm of the coefficients (L1-QR). First, we show that L1-QR is consistent at the rate √((s/n) log p), which is close to the oracle rate √(s/n) achievable when the minimal true model is known. The overall number of regressors p affects the rate only through the log p factor, thus allowing nearly exponential growth in the number of zero-impact regressors. The rate result holds under relatively weak conditions, requiring that s/n converges to zero at a super-logarithmic speed and that the regularization parameter satisfies certain theoretical constraints. Second, we propose a pivotal, data-driven choice of the regularization parameter and show that it satisfies these theoretical constraints. Third, we show that L1-QR correctly selects the true minimal model as a valid submodel when the non-zero coefficients of the true model are well separated from zero. We also show that the number of non-zero coefficients in L1-QR is of the same stochastic order as s, the number of non-zero coefficients in the minimal true model. Fourth, we analyze the rate of convergence of a two-step estimator that applies ordinary quantile regression to the selected model. Fifth, we evaluate the performance of L1-QR in a Monte Carlo experiment and provide an application to the analysis of international economic growth.
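For concreteness, L1-QR minimizes the penalized check loss (1/n) Σ_i ρ_τ(y_i − x_i'β) + λ‖β‖_1, where ρ_τ(u) = u(τ − 1{u < 0}) is the check function. The sketch below is a minimal Python illustration of this estimator and of the two-step post-selection refit, using scikit-learn's QuantileRegressor; it is not the paper's implementation. The penalty level is a simple plug-in λ = c·√(log p / n) motivated by the rate above rather than the paper's pivotal, data-driven choice, and the constant c, the simulated design, and the true coefficients are illustrative assumptions.

# Minimal sketch of L1-penalized median regression (L1-QR) on a
# high-dimensional sparse design with p > n.  The penalty level is a
# simple sqrt(log(p)/n) plug-in, NOT the paper's pivotal data-driven
# choice; c = 1.0 and the simulated design are illustrative assumptions.
import numpy as np
from sklearn.linear_model import QuantileRegressor

rng = np.random.default_rng(0)
n, p, s = 200, 500, 5                      # sample size, regressors, sparsity
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:s] = 1.0                             # only s regressors have non-zero impact
y = X @ beta + rng.standard_normal(n)      # noise with conditional median zero

tau = 0.5                                  # median regression
c = 1.0                                    # illustrative constant
lam = c * np.sqrt(np.log(p) / n)           # rate-motivated penalty level

# QuantileRegressor minimizes (1/n) * sum of check losses + alpha * ||beta||_1
l1_qr = QuantileRegressor(quantile=tau, alpha=lam, fit_intercept=True,
                          solver="highs")
l1_qr.fit(X, y)

selected = np.flatnonzero(np.abs(l1_qr.coef_) > 1e-8)
print("selected regressors:", selected)

# Two-step (post-L1-QR) estimator: refit ordinary quantile regression
# (alpha = 0, i.e. no penalty) on the selected submodel only.
post = QuantileRegressor(quantile=tau, alpha=0.0, fit_intercept=True,
                         solver="highs")
post.fit(X[:, selected], y)
print("post-selection coefficients:", post.coef_)

In this sketch the L1 step typically recovers the s strong regressors (their coefficients are well separated from zero), and the unpenalized refit on the selected submodel removes the shrinkage bias, mirroring the two-step estimator analyzed in the paper.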