Authors: Le‐Yu Chen, S. Lee
Journal: Econometrics Journal
Published: 2020-06-20
DOI: 10.1093/ectj/utaa017 (https://doi.org/10.1093/ectj/utaa017)
Binary classification with covariate selection through ℓ0-penalised empirical risk minimisation
We consider the problem of binary classification with covariate selection. We construct a classification procedure by minimising the empirical misclassification risk with a penalty on the number of selected covariates. This optimisation problem is equivalent to obtaining an ℓ0-penalised maximum score estimator. We derive probability bounds on the estimated sparsity as well as on the excess misclassification risk. These theoretical results are nonasymptotic and established in a high-dimensional setting. In particular, we show that our method yields a sparse solution whose ℓ0-norm can be arbitrarily close to true sparsity with high probability and obtain the rates of convergence for the excess misclassification risk. We implement the proposed procedure via the method of mixed-integer linear programming. Its numerical performance is illustrated in Monte Carlo experiments and a real data application of the work-trip transportation mode choice.
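The penalised objective described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation. It assumes a linear classification rule that predicts 1 when the score x'beta is non-negative, and it replaces the mixed-integer linear programme with an exhaustive search over a tiny coefficient grid, which is only feasible for very small problems. The names `penalised_risk`, `brute_force_fit`, `lam`, and `grid`, as well as the toy data, are hypothetical.

```python
from itertools import product


def penalised_risk(beta, X, y, lam):
    """Empirical misclassification rate plus lam times the number of nonzero coefficients."""
    errors = 0
    for x_i, y_i in zip(X, y):
        score = sum(b * v for b, v in zip(beta, x_i))
        pred = 1 if score >= 0 else 0  # assumed rule: predict 1 when the score is non-negative
        errors += int(pred != y_i)
    l0_norm = sum(1 for b in beta if b != 0)  # count of selected covariates
    return errors / len(y) + lam * l0_norm


def brute_force_fit(X, y, lam, grid=(-1, 0, 1)):
    """Exhaustive search over a coefficient grid (a stand-in for the MILP solver)."""
    p = len(X[0])
    best_beta, best_value = None, float("inf")
    for beta in product(grid, repeat=p):
        value = penalised_risk(beta, X, y, lam)
        if value < best_value:
            best_beta, best_value = beta, value
    return best_beta, best_value


# Toy data (made up for illustration): the label depends only on the first covariate.
X = [(1.0, 0.5), (2.0, -0.3), (-1.0, 0.4), (-2.0, -0.6), (1.5, 0.1), (-0.5, -0.2)]
y = [1, 1, 0, 0, 1, 0]

best_beta, best_value = brute_force_fit(X, y, lam=0.1)
print(best_beta, best_value)
```

On this toy data several coefficient vectors classify every observation correctly, and the penalty term is what breaks the tie in favour of the vector that uses the fewest covariates.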
About the journal:
The Econometrics Journal was established in 1998 by the Royal Economic Society with the aim of creating a top international field journal for the publication of econometric research with a standard of intellectual rigour and academic standing similar to those of the pre-existing top field journals in econometrics. The Econometrics Journal is committed to publishing first-class papers in macro-, micro- and financial econometrics. It is a general journal for econometric research open to all areas of econometrics, whether applied, computational, methodological or theoretical contributions.