High-dimensional penalized Bernstein support vector classifier
Rachid Kharoubi, Abdallah Mkhadri, Karim Oualkacha
Computational Statistics, published 2024-01-16. DOI: 10.1007/s00180-023-01448-z (https://doi.org/10.1007/s00180-023-01448-z)
The support vector machine (SVM) is a powerful classifier for binary classification, valued for its prediction accuracy. However, the nondifferentiability of the SVM hinge loss can cause computational difficulties in high-dimensional settings. To overcome this problem, we rely on the Bernstein polynomial and propose a new smoothed version of the SVM hinge loss, called the Bernstein support vector machine (BernSVC), which is suitable for the high-dimensional regime. Because the BernSVC loss function is twice differentiable everywhere, we propose two efficient algorithms for computing the solution of the penalized BernSVC. The first is a coordinate descent algorithm based on the majorization-minimization (MM) principle, and the second is an iteratively reweighted least squares (IRLS)-type algorithm. Under standard assumptions, we derive a cone condition and a restricted strong convexity condition to establish an upper bound on the estimation error of the weighted lasso BernSVC estimator. Using a local linear approximation, we extend this result to the penalized BernSVC with the nonconvex penalties SCAD and MCP. Our bound holds with high probability and achieves the so-called fast rate under mild conditions on the design matrix. Simulation studies illustrate the prediction accuracy of BernSVC relative to its competitors and compare the two algorithms in terms of computation time and estimation error. The use of the proposed method is illustrated through the analysis of three large-scale real data examples.
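The ideas in the abstract can be sketched in a few lines of Python. The abstract does not state the BernSVC loss formula, so this sketch assumes one particular twice-differentiable quartic smoothing of the hinge loss on the window |1 - t| <= delta. It may differ from the authors' exact loss. The names `smooth_hinge`, `smooth_hinge_grad`, `soft_threshold`, and `mm_coordinate_descent`, and the parameters `delta`, `lam`, and `n_sweeps`, are hypothetical and chosen for illustration only. Only the plain lasso penalty is covered; the weighted lasso, SCAD, and MCP variants are not.

```python
import numpy as np

def smooth_hinge(t, delta=0.5):
    """Assumed C^2 quartic smoothing of max(0, 1 - t) on |1 - t| <= delta."""
    u = 1.0 - np.asarray(t, dtype=float)
    mid = u / 2 + 3 * delta / 16 + 3 * u**2 / (8 * delta) - u**4 / (16 * delta**3)
    return np.where(u >= delta, u, np.where(u <= -delta, 0.0, mid))

def smooth_hinge_grad(t, delta=0.5):
    """Derivative of smooth_hinge with respect to the margin t."""
    u = 1.0 - np.asarray(t, dtype=float)
    mid = 0.5 + 3 * u / (4 * delta) - u**3 / (4 * delta**3)
    return -np.where(u >= delta, 1.0, np.where(u <= -delta, 0.0, mid))

def soft_threshold(z, lam):
    """Closed-form minimizer of 0.5 * (b - z)**2 + lam * |b|."""
    return np.sign(z) * max(abs(z) - lam, 0.0)

def mm_coordinate_descent(X, y, lam, delta=0.5, n_sweeps=100):
    """Minimize mean(smooth_hinge(y * (b0 + X @ beta))) + lam * ||beta||_1.

    Labels y must be +1 or -1. Each coordinate update minimizes a quadratic
    majorizer of the smooth loss plus the lasso penalty (MM principle).
    """
    n, p = X.shape
    beta, b0 = np.zeros(p), 0.0
    curv = 3.0 / (4.0 * delta)           # bound on the loss's second derivative
    L = curv * (X**2).mean(axis=0)       # per-coordinate curvature bounds
    margin = y * (b0 + X @ beta)
    for _ in range(n_sweeps):
        # Unpenalized intercept: one majorized Newton-type step.
        g0 = np.mean(smooth_hinge_grad(margin, delta) * y)
        b0 -= g0 / curv
        margin -= y * g0 / curv
        for j in range(p):
            if L[j] == 0.0:              # constant-zero column, nothing to fit
                continue
            g = np.mean(smooth_hinge_grad(margin, delta) * y * X[:, j])
            new = soft_threshold(beta[j] - g / L[j], lam / L[j])
            if new != beta[j]:
                margin += y * X[:, j] * (new - beta[j])
                beta[j] = new
    return b0, beta

if __name__ == "__main__":
    # Synthetic data for illustration only: 2 informative features out of 50.
    rng = np.random.default_rng(0)
    X = rng.standard_normal((100, 50))
    y = np.where(X[:, 0] - X[:, 1] > 0, 1.0, -1.0)
    b0, beta = mm_coordinate_descent(X, y, lam=0.05)
    print("nonzero coefficients:", np.flatnonzero(beta))
```

Because the assumed loss has second derivative at most 3 / (4 * delta), each coordinate's quadratic surrogate lies above the true objective, so every update can only lower the penalized objective or leave it unchanged. A smaller `delta` tracks the hinge loss more closely, at the cost of a larger curvature bound and shorter steps.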
About the journal:
Computational Statistics (CompStat) is an international journal that promotes the publication of applications and methodological research in the field of computational statistics. Papers in CompStat focus on the contribution and influence of computing on statistics, and vice versa. The journal provides a forum for computer scientists, mathematicians, and statisticians working in a variety of fields of statistics, such as biometrics, econometrics, data analysis, graphics, simulation, algorithms, knowledge-based systems, and Bayesian computing. CompStat also publishes hardware, software, and package reports.