序贯QP LP与Slack变量分析提升svm效率

Journal of Computer Science Pub Date : 2023-10-01 DOI:10.3844/jcssp.2023.1253.1262

Amit Kumar Kundu, Rezaul Karim, Ali Ahmed Ave

{"title":"序贯QP LP与Slack变量分析提升svm效率","authors":"Amit Kumar Kundu, Rezaul Karim, Ali Ahmed Ave","doi":"10.3844/jcssp.2023.1253.1262","DOIUrl":null,"url":null,"abstract":"Support Vector Machine (SVM) is a highly attractive algorithm among many machine learning models due to its generalization power and classification performance based on sound mathematical formulation being convex that offers global minimum. However, despite being sparse, its high classification cost from kernel execution with Support Vectors (SVs) reduces the user's interest when there are hard computational constraints in the application, especially, for large and difficult data. So far in our knowledge, out of many existing works to overcome this problem, some are really interesting and heavy but get less attractive due to improper training difficulties for example, excessive cost-memory requirement, initialization, and parameter selection trouble because of the non-convexity of the problems while the other few that avoid these problems, cannot generate sparsity and complexity simultaneously of the final discriminator upto satisfactory level for very large and tricky data. In this direction, we propose a novel algorithm Efficiency Escalated SVM (EESVM) that solves two convex problems using Quadratic Programming (QP) and Linear Programming (LP) in sequence. This is followed by computational analysis on the remaining smallest set of slack variables that ultimately build two very essential properties of the machine: (i) Highly efficient by being heavily sparse and optimally complex and (ii) Able to handle very large and noise-effected complicated data. Benchmarking shows that this EESVM demands kernel computation as little as 6.8% of the standard QPSVM while posing almost the same classification accuracy on test data and requiring 42.7, 27.7 and 46.6% that of other three implemented state-of-the-art heavy-sparse machines while offering similar classification accuracy. It claims the lowest Machine Accuracy Cost (MAC) value among all of these machines though showing very similar generalization performance that is evaluated numerically using the term Generalization Failure Rate (GFR). Being quite pragmatic for modern technological advancement, it is indispensable for optimum manipulation of the troublesome massive, and difficult data.","PeriodicalId":40005,"journal":{"name":"Journal of Computer Science","volume":"125 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Escalating SVM-Efficiency by Sequential QP LP and Slack Variable Analysis\",\"authors\":\"Amit Kumar Kundu, Rezaul Karim, Ali Ahmed Ave\",\"doi\":\"10.3844/jcssp.2023.1253.1262\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Support Vector Machine (SVM) is a highly attractive algorithm among many machine learning models due to its generalization power and classification performance based on sound mathematical formulation being convex that offers global minimum. However, despite being sparse, its high classification cost from kernel execution with Support Vectors (SVs) reduces the user's interest when there are hard computational constraints in the application, especially, for large and difficult data. So far in our knowledge, out of many existing works to overcome this problem, some are really interesting and heavy but get less attractive due to improper training difficulties for example, excessive cost-memory requirement, initialization, and parameter selection trouble because of the non-convexity of the problems while the other few that avoid these problems, cannot generate sparsity and complexity simultaneously of the final discriminator upto satisfactory level for very large and tricky data. In this direction, we propose a novel algorithm Efficiency Escalated SVM (EESVM) that solves two convex problems using Quadratic Programming (QP) and Linear Programming (LP) in sequence. This is followed by computational analysis on the remaining smallest set of slack variables that ultimately build two very essential properties of the machine: (i) Highly efficient by being heavily sparse and optimally complex and (ii) Able to handle very large and noise-effected complicated data. Benchmarking shows that this EESVM demands kernel computation as little as 6.8% of the standard QPSVM while posing almost the same classification accuracy on test data and requiring 42.7, 27.7 and 46.6% that of other three implemented state-of-the-art heavy-sparse machines while offering similar classification accuracy. It claims the lowest Machine Accuracy Cost (MAC) value among all of these machines though showing very similar generalization performance that is evaluated numerically using the term Generalization Failure Rate (GFR). Being quite pragmatic for modern technological advancement, it is indispensable for optimum manipulation of the troublesome massive, and difficult data.\",\"PeriodicalId\":40005,\"journal\":{\"name\":\"Journal of Computer Science\",\"volume\":\"125 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-10-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Computer Science\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.3844/jcssp.2023.1253.1262\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Computer Science","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3844/jcssp.2023.1253.1262","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 0

摘要

支持向量机(SVM)是众多机器学习模型中极具吸引力的一种算法，因为它的泛化能力和基于可靠的数学公式是凸的且提供全局最小值的分类性能。然而，尽管是稀疏的，但当应用程序中存在硬计算约束时，特别是对于大型和困难的数据时，使用支持向量(SVs)执行内核的高分类成本降低了用户的兴趣。到目前为止，在我们所知的许多克服这个问题的现有工作中，有些非常有趣和繁重，但由于不适当的训练困难而变得不那么吸引人，例如，过高的成本-内存需求，初始化和参数选择麻烦，因为问题的非凸性，而其他少数避免了这些问题。对于非常庞大和复杂的数据，不能同时生成令人满意的最终鉴别器的稀疏性和复杂性。在这个方向上，我们提出了一种新的算法效率升级支持向量机(EESVM)，该算法使用二次规划(QP)和线性规划(LP)依次解决两个凸问题。接下来是对剩余最小松弛变量集的计算分析，最终构建机器的两个非常重要的属性:(i)通过高度稀疏和最佳复杂而高效;(ii)能够处理非常大且受噪声影响的复杂数据。基准测试表明，该EESVM所需的内核计算量仅为标准QPSVM的6.8%，而对测试数据的分类精度几乎相同，在提供相似分类精度的情况下，该EESVM的分类精度分别为其他三种实现的最先进的重稀疏机器的42.7、27.7和46.6%。它声称在所有这些机器中最低的机器精度成本(MAC)值，尽管显示非常相似的泛化性能，使用术语泛化故障率(GFR)进行数值评估。对于现代技术进步来说，它是非常实用的，对于麻烦的海量、困难的数据进行优化处理是必不可少的。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Escalating SVM-Efficiency by Sequential QP LP and Slack Variable Analysis

Support Vector Machine (SVM) is a highly attractive algorithm among many machine learning models due to its generalization power and classification performance based on sound mathematical formulation being convex that offers global minimum. However, despite being sparse, its high classification cost from kernel execution with Support Vectors (SVs) reduces the user's interest when there are hard computational constraints in the application, especially, for large and difficult data. So far in our knowledge, out of many existing works to overcome this problem, some are really interesting and heavy but get less attractive due to improper training difficulties for example, excessive cost-memory requirement, initialization, and parameter selection trouble because of the non-convexity of the problems while the other few that avoid these problems, cannot generate sparsity and complexity simultaneously of the final discriminator upto satisfactory level for very large and tricky data. In this direction, we propose a novel algorithm Efficiency Escalated SVM (EESVM) that solves two convex problems using Quadratic Programming (QP) and Linear Programming (LP) in sequence. This is followed by computational analysis on the remaining smallest set of slack variables that ultimately build two very essential properties of the machine: (i) Highly efficient by being heavily sparse and optimally complex and (ii) Able to handle very large and noise-effected complicated data. Benchmarking shows that this EESVM demands kernel computation as little as 6.8% of the standard QPSVM while posing almost the same classification accuracy on test data and requiring 42.7, 27.7 and 46.6% that of other three implemented state-of-the-art heavy-sparse machines while offering similar classification accuracy. It claims the lowest Machine Accuracy Cost (MAC) value among all of these machines though showing very similar generalization performance that is evaluated numerically using the term Generalization Failure Rate (GFR). Being quite pragmatic for modern technological advancement, it is indispensable for optimum manipulation of the troublesome massive, and difficult data.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Journal of Computer Science Computer Science-Computer Networks and Communications

CiteScore

1.70

自引率

0.00%

发文量

期刊介绍： Journal of Computer Science is aimed to publish research articles on theoretical foundations of information and computation, and of practical techniques for their implementation and application in computer systems. JCS updated twelve times a year and is a peer reviewed journal covers the latest and most compelling research of the time.