{"title":"序贯QP LP与Slack变量分析提升svm效率","authors":"Amit Kumar Kundu, Rezaul Karim, Ali Ahmed Ave","doi":"10.3844/jcssp.2023.1253.1262","DOIUrl":null,"url":null,"abstract":"Support Vector Machine (SVM) is a highly attractive algorithm among many machine learning models due to its generalization power and classification performance based on sound mathematical formulation being convex that offers global minimum. However, despite being sparse, its high classification cost from kernel execution with Support Vectors (SVs) reduces the user's interest when there are hard computational constraints in the application, especially, for large and difficult data. So far in our knowledge, out of many existing works to overcome this problem, some are really interesting and heavy but get less attractive due to improper training difficulties for example, excessive cost-memory requirement, initialization, and parameter selection trouble because of the non-convexity of the problems while the other few that avoid these problems, cannot generate sparsity and complexity simultaneously of the final discriminator upto satisfactory level for very large and tricky data. In this direction, we propose a novel algorithm Efficiency Escalated SVM (EESVM) that solves two convex problems using Quadratic Programming (QP) and Linear Programming (LP) in sequence. This is followed by computational analysis on the remaining smallest set of slack variables that ultimately build two very essential properties of the machine: (i) Highly efficient by being heavily sparse and optimally complex and (ii) Able to handle very large and noise-effected complicated data. Benchmarking shows that this EESVM demands kernel computation as little as 6.8% of the standard QPSVM while posing almost the same classification accuracy on test data and requiring 42.7, 27.7 and 46.6% that of other three implemented state-of-the-art heavy-sparse machines while offering similar classification accuracy. It claims the lowest Machine Accuracy Cost (MAC) value among all of these machines though showing very similar generalization performance that is evaluated numerically using the term Generalization Failure Rate (GFR). Being quite pragmatic for modern technological advancement, it is indispensable for optimum manipulation of the troublesome massive, and difficult data.","PeriodicalId":40005,"journal":{"name":"Journal of Computer Science","volume":"125 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Escalating SVM-Efficiency by Sequential QP LP and Slack Variable Analysis\",\"authors\":\"Amit Kumar Kundu, Rezaul Karim, Ali Ahmed Ave\",\"doi\":\"10.3844/jcssp.2023.1253.1262\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Support Vector Machine (SVM) is a highly attractive algorithm among many machine learning models due to its generalization power and classification performance based on sound mathematical formulation being convex that offers global minimum. However, despite being sparse, its high classification cost from kernel execution with Support Vectors (SVs) reduces the user's interest when there are hard computational constraints in the application, especially, for large and difficult data. So far in our knowledge, out of many existing works to overcome this problem, some are really interesting and heavy but get less attractive due to improper training difficulties for example, excessive cost-memory requirement, initialization, and parameter selection trouble because of the non-convexity of the problems while the other few that avoid these problems, cannot generate sparsity and complexity simultaneously of the final discriminator upto satisfactory level for very large and tricky data. In this direction, we propose a novel algorithm Efficiency Escalated SVM (EESVM) that solves two convex problems using Quadratic Programming (QP) and Linear Programming (LP) in sequence. This is followed by computational analysis on the remaining smallest set of slack variables that ultimately build two very essential properties of the machine: (i) Highly efficient by being heavily sparse and optimally complex and (ii) Able to handle very large and noise-effected complicated data. Benchmarking shows that this EESVM demands kernel computation as little as 6.8% of the standard QPSVM while posing almost the same classification accuracy on test data and requiring 42.7, 27.7 and 46.6% that of other three implemented state-of-the-art heavy-sparse machines while offering similar classification accuracy. It claims the lowest Machine Accuracy Cost (MAC) value among all of these machines though showing very similar generalization performance that is evaluated numerically using the term Generalization Failure Rate (GFR). Being quite pragmatic for modern technological advancement, it is indispensable for optimum manipulation of the troublesome massive, and difficult data.\",\"PeriodicalId\":40005,\"journal\":{\"name\":\"Journal of Computer Science\",\"volume\":\"125 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-10-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Computer Science\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.3844/jcssp.2023.1253.1262\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Computer Science","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3844/jcssp.2023.1253.1262","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Escalating SVM-Efficiency by Sequential QP LP and Slack Variable Analysis
Support Vector Machine (SVM) is a highly attractive algorithm among many machine learning models due to its generalization power and classification performance based on sound mathematical formulation being convex that offers global minimum. However, despite being sparse, its high classification cost from kernel execution with Support Vectors (SVs) reduces the user's interest when there are hard computational constraints in the application, especially, for large and difficult data. So far in our knowledge, out of many existing works to overcome this problem, some are really interesting and heavy but get less attractive due to improper training difficulties for example, excessive cost-memory requirement, initialization, and parameter selection trouble because of the non-convexity of the problems while the other few that avoid these problems, cannot generate sparsity and complexity simultaneously of the final discriminator upto satisfactory level for very large and tricky data. In this direction, we propose a novel algorithm Efficiency Escalated SVM (EESVM) that solves two convex problems using Quadratic Programming (QP) and Linear Programming (LP) in sequence. This is followed by computational analysis on the remaining smallest set of slack variables that ultimately build two very essential properties of the machine: (i) Highly efficient by being heavily sparse and optimally complex and (ii) Able to handle very large and noise-effected complicated data. Benchmarking shows that this EESVM demands kernel computation as little as 6.8% of the standard QPSVM while posing almost the same classification accuracy on test data and requiring 42.7, 27.7 and 46.6% that of other three implemented state-of-the-art heavy-sparse machines while offering similar classification accuracy. It claims the lowest Machine Accuracy Cost (MAC) value among all of these machines though showing very similar generalization performance that is evaluated numerically using the term Generalization Failure Rate (GFR). Being quite pragmatic for modern technological advancement, it is indispensable for optimum manipulation of the troublesome massive, and difficult data.
期刊介绍:
Journal of Computer Science is aimed to publish research articles on theoretical foundations of information and computation, and of practical techniques for their implementation and application in computer systems. JCS updated twelve times a year and is a peer reviewed journal covers the latest and most compelling research of the time.