Carlos Aníbal Suárez, Mauricio Castro, Mariuxi Leon, Carlos Martin-Barreiro, Michael Liut
Title: Improving SVM performance through data reduction and misclassification analysis with linear programming.
Journal: Complex & Intelligent Systems (volume: 16 1; pages: 356)
Publication date: 2025-06-21
Publication type: Journal Article
DOI: 10.1007/s40747-025-01989-4 (https://doi.org/10.1007/s40747-025-01989-4)
Impact factor: 5.0; JCR: Q1 (Computer Science, Artificial Intelligence); Region 2 (Computer Science)
Open access: no
Citations: 0
Abstract
In the dual optimization problem behind the Support Vector Machine (SVM), each data point corresponds to a decision variable. Removing data points therefore reduces the dimensionality of the dual problem, leading to a more efficient optimization process. We introduce linear programming models to efficiently determine whether two sets of points are linearly separable, to compute the misclassification rate, and to reduce the dimension of the optimization problems behind the SVM procedure. In the linearly separable case, data reduction can be conducted using a simple convexity property. The misclassification rate is a key indicator of how difficult it is to separate the two sets, and it provides insight into classification performance. Our approach combines SVM optimization with linear programming techniques to offer a comprehensive framework for classification and complexity analysis.
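To make the idea of an LP-based separability test concrete, here is a minimal sketch in Python. It is not the authors' model, which the abstract does not spell out. It assumes a standard slack-minimizing formulation (minimize the total slack s subject to y_i (w·x_i + b) ≥ 1 − s_i, s_i ≥ 0) solved with `scipy.optimize.linprog`. The function name `separability_lp` and the tolerance `1e-8` are hypothetical choices made for illustration. The returned error fraction is only a rough proxy for a misclassification rate, since it counts the points on the wrong side of the hyperplane that this particular LP happens to return.

```python
import numpy as np
from scipy.optimize import linprog


def separability_lp(X, y, tol=1e-8):
    """Illustrative slack-minimizing LP (an assumption, not the paper's model).

    Variables: w (d values), b (1 value), s (n slack values).
        minimize   sum(s)
        subject to y_i * (w . x_i + b) >= 1 - s_i,   s_i >= 0
    The optimal total slack is 0 exactly when the two classes,
    labelled +1 and -1, are linearly separable.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, d = X.shape

    # Objective: zero cost on w and b, unit cost on every slack variable.
    c = np.concatenate([np.zeros(d + 1), np.ones(n)])

    # Rewrite each constraint as  -y_i x_i . w - y_i b - s_i <= -1.
    A_ub = np.hstack([-y[:, None] * X, -y[:, None], -np.eye(n)])
    b_ub = -np.ones(n)

    # w and b are free; slack variables are non-negative.
    bounds = [(None, None)] * (d + 1) + [(0, None)] * n

    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    w, b = res.x[:d], res.x[d]

    separable = bool(res.fun <= tol)
    # Fraction of points not strictly on their own side of the hyperplane.
    error_fraction = float(np.mean(y * (X @ w + b) <= 0))
    return separable, error_fraction, w, b


if __name__ == "__main__":
    # Two classes split by a vertical line, then the XOR pattern.
    print(separability_lp([[0, 0], [0, 1], [2, 0], [2, 1]], [-1, -1, 1, 1])[:2])
    print(separability_lp([[0, 0], [1, 1], [0, 1], [1, 0]], [1, 1, -1, -1])[:2])
```

The first toy set is separable, so the LP should reach zero total slack. The XOR pattern is not, so the total slack stays positive and at least one point ends up on the wrong side. The LP has one slack variable per data point, which mirrors the abstract's observation that problem size grows with the number of points and that removing points shrinks the optimization problem.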
About the journal:
Complex & Intelligent Systems aims to provide a forum for presenting and discussing novel approaches, tools and techniques meant for attaining a cross-fertilization between the broad fields of complex systems, computational simulation, and intelligent analytics and visualization. The transdisciplinary research that the journal focuses on will expand the boundaries of our understanding by investigating the principles and processes that underlie many of the most profound problems facing society today.