{"title":"探索时间高效支持向量机分类器的数据约简技术","authors":"R. Rastogi, H. Safdari, Sweta Sharma","doi":"10.1109/SSCI.2018.8628716","DOIUrl":null,"url":null,"abstract":"Support Vector Machines [1] (SVMs) are regarded as powerful machine learning tool because of their inherent properties. However, one major challenge for using SVMs in real-world applications with large datasets is its high training time complexity. Over the years, many variants of SVM have been proposed to reduce the training time by either using algorithmic modifications (such as LS-SVM [3], GEP-SVM [4], TWSVM [5]) or training level speed-ups (such as SMO [6], SOR [2] and Stochastic Gradient Descent method [7]). However, these methods deal with the entire data for learning a classifier model, thus the space complexity could be a challenge. A more fitting approach is to use an Instance Selection method (IS) which selects a subset of data which is best representative of the underlying data distribution. Since SVMs by definition use the geometry of patterns for classification, this study explores the effects of different Instance Selection methods on different variants of SVM to check their effectiveness using their comparative performances in terms of training time and generalization ability. Various theoretical and experimental comparisons on standard datasets have been provided to validate the efficacy of different IS methods on SVM based classifiers.","PeriodicalId":235735,"journal":{"name":"2018 IEEE Symposium Series on Computational Intelligence (SSCI)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Exploring Data Reduction Techniques for Time Efficient Support Vector Machine Classifiers\",\"authors\":\"R. Rastogi, H. Safdari, Sweta Sharma\",\"doi\":\"10.1109/SSCI.2018.8628716\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Support Vector Machines [1] (SVMs) are regarded as powerful machine learning tool because of their inherent properties. However, one major challenge for using SVMs in real-world applications with large datasets is its high training time complexity. Over the years, many variants of SVM have been proposed to reduce the training time by either using algorithmic modifications (such as LS-SVM [3], GEP-SVM [4], TWSVM [5]) or training level speed-ups (such as SMO [6], SOR [2] and Stochastic Gradient Descent method [7]). However, these methods deal with the entire data for learning a classifier model, thus the space complexity could be a challenge. A more fitting approach is to use an Instance Selection method (IS) which selects a subset of data which is best representative of the underlying data distribution. Since SVMs by definition use the geometry of patterns for classification, this study explores the effects of different Instance Selection methods on different variants of SVM to check their effectiveness using their comparative performances in terms of training time and generalization ability. 
Various theoretical and experimental comparisons on standard datasets have been provided to validate the efficacy of different IS methods on SVM based classifiers.\",\"PeriodicalId\":235735,\"journal\":{\"name\":\"2018 IEEE Symposium Series on Computational Intelligence (SSCI)\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2018-11-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2018 IEEE Symposium Series on Computational Intelligence (SSCI)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/SSCI.2018.8628716\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 IEEE Symposium Series on Computational Intelligence (SSCI)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SSCI.2018.8628716","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Exploring Data Reduction Techniques for Time Efficient Support Vector Machine Classifiers
Support Vector Machines (SVMs) [1] are regarded as powerful machine learning tools because of their inherent properties. However, one major challenge in applying SVMs to real-world applications with large datasets is their high training-time complexity. Over the years, many variants of SVM have been proposed to reduce training time, either through algorithmic modifications (such as LS-SVM [3], GEP-SVM [4], and TWSVM [5]) or through training-level speed-ups (such as SMO [6], SOR [2], and the Stochastic Gradient Descent method [7]). However, these methods still process the entire dataset to learn the classifier model, so their space complexity can be a challenge. A more fitting approach is to use an Instance Selection (IS) method, which selects the subset of the data that best represents the underlying data distribution. Since SVMs by definition use the geometry of patterns for classification, this study explores the effects of different Instance Selection methods on different variants of SVM, assessing their effectiveness through comparative performance in terms of training time and generalization ability. Various theoretical and experimental comparisons on standard datasets are provided to validate the efficacy of different IS methods on SVM-based classifiers.
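The core idea of the abstract is to train the same SVM on an IS-reduced subset instead of the full dataset. As a minimal illustration of that idea (not the paper's actual pipeline), the sketch below applies Hart's single-pass Condensed Nearest Neighbour rule, one classic instance selection method, before fitting scikit-learn's SVC; the synthetic dataset, RBF kernel, and choice of CNN as the IS method are illustrative assumptions, not details taken from the paper.

import time
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

def condensed_nearest_neighbour(X, y, rng):
    """Single-pass Hart CNN: keep only the instances that the current
    prototype set misclassifies under a 1-NN rule. Refitting the 1-NN
    each step is quadratic; it is written this way for clarity."""
    order = rng.permutation(len(X))
    keep = [order[0]]  # seed the prototype set with one random instance
    for i in order[1:]:
        nn = KNeighborsClassifier(n_neighbors=1).fit(X[keep], y[keep])
        if nn.predict(X[i:i + 1])[0] != y[i]:
            keep.append(i)  # absorbed: the prototypes got this one wrong
    return np.asarray(keep)

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

idx = condensed_nearest_neighbour(X_tr, y_tr, rng)
for name, (Xs, ys) in {"full data": (X_tr, y_tr),
                       "CNN-reduced": (X_tr[idx], y_tr[idx])}.items():
    t0 = time.perf_counter()
    clf = SVC(kernel="rbf").fit(Xs, ys)  # same SVM, different training set
    print(f"{name:12s} n={len(ys):4d}  "
          f"train={time.perf_counter() - t0:.2f}s  "
          f"test acc={clf.score(X_te, y_te):.3f}")

CNN tends to retain points near class boundaries, which is one reason instance selection pairs naturally with margin-based classifiers such as SVMs; the exact training-time/accuracy trade-off depends on the IS method and the SVM variant, which is what the paper's comparisons examine.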