SVM Classification for Large Data Sets by Considering Models of Classes Distribution

Jair Cervantes, Xiaoou Li, Wen Yu
Published in: 2007 Sixth Mexican International Conference on Artificial Intelligence, Special Session (MICAI)
Publication date: 2007-11-04
DOI: 10.1109/MICAI.2007.27 (https://doi.org/10.1109/MICAI.2007.27)
Citations: 35

Abstract

Despite the sound theoretical foundations and high classification accuracy of support vector machines (SVMs), a standard SVM is not suitable for classifying large data sets because its training complexity is very high. This paper presents a novel SVM classification approach for large data sets that considers models of classes distribution (MCD). A first stage uses SVM classification to obtain a sketch of the class distribution. The algorithm then finds the closest support vectors (SVs) between the classes and constructs a minimum enclosing ball from each pair of SVs with different labels. The data points contained in the balls constitute the MCD, which forms the framework along the boundary of each class and represents the most important data points. These data points are used as training data for a subsequent SVM classification. Experimental results show that our approach achieves good classification accuracy while training significantly faster than other SVM classifiers.
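The ball-construction step described in the abstract can be illustrated with a minimal sketch. This is one reading of the abstract, not the authors' implementation. It assumes the support vectors of a first-stage SVM are already available as two lists, one per class. The pairing rule (each positive SV is paired with its nearest negative SV), the function names, and the `scale` parameter are all hypothetical choices made for illustration. The only fact relied on is that the minimum enclosing ball of two points is centred at their midpoint with a radius of half their distance.

```python
import math
from typing import List, Sequence, Tuple

Point = Sequence[float]
Ball = Tuple[Tuple[float, ...], float]


def nearest_opposite_pairs(pos_svs: List[Point], neg_svs: List[Point]) -> List[Tuple[Point, Point]]:
    """Pair each positive SV with its closest negative SV (assumed pairing rule)."""
    return [(p, min(neg_svs, key=lambda n: math.dist(p, n))) for p in pos_svs]


def enclosing_ball(p: Point, q: Point, scale: float = 1.0) -> Ball:
    """Minimum enclosing ball of two points: midpoint centre, half-distance radius.

    `scale` is a hypothetical knob for enlarging the ball; 1.0 gives the exact MEB.
    """
    center = tuple((a + b) / 2 for a, b in zip(p, q))
    radius = scale * math.dist(p, q) / 2
    return center, radius


def select_mcd(points: List[Point], labels: List[int],
               pos_svs: List[Point], neg_svs: List[Point],
               scale: float = 1.0) -> List[Tuple[Tuple[float, ...], int]]:
    """Keep only the labelled points that fall inside at least one ball."""
    balls = [enclosing_ball(p, q, scale)
             for p, q in nearest_opposite_pairs(pos_svs, neg_svs)]
    reduced = []
    for x, y in zip(points, labels):
        # Small tolerance so points lying exactly on the surface are kept.
        if any(math.dist(x, c) <= r + 1e-12 for c, r in balls):
            reduced.append((tuple(x), y))
    return reduced
```

Under these assumptions, the list returned by `select_mcd` would serve as the reduced training set for the second SVM stage. The SVM training itself is left out of the sketch so that it stays independent of any particular library.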