On the probability of feature selection in support vector classification

Qunfeng Liu, Lan Yao
Proceedings of 2013 IEEE International Conference on Service Operations and Logistics, and Informatics. Published 2013-07-28. DOI: 10.1109/SOLI.2013.6611436 (https://doi.org/10.1109/SOLI.2013.6611436)

Abstract

Feature selection is important for classification problems, especially when the number of features is very large or the data are noisy. The support vector machine (SVM) with Lp regularization is a popular approach to feature selection. Much research has been devoted to developing efficient methods for solving the optimization problem in support vector machines. However, to our knowledge, there is still no formal proof or comprehensive mathematical understanding of how Lp regularization brings about feature selection. In this paper, we first show that feature selection depends not only on the parameter p but also on the data itself. If the feasible region generated from the data lies relatively far from the coordinate axes, then feature selection may be impossible for any p. Otherwise, a small p can help to enhance the feature selection ability of Lp-SVM. We then provide a formula for computing the probabilities that measure the feature selection ability. The only assumption is that the optimal solutions of all possible classification problems are distributed uniformly on the contour of the objective function. Based on this formula, we compute the probabilities for some popular values of p.
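The two qualitative claims in the abstract (a small p encourages sparse solutions, and some feasible regions admit no sparse solution for any p) can be illustrated with a toy two-dimensional sketch. This is not the paper's probability formula. The feasible regions, the grid search, and the names `minimize_lp`, `near_axes`, `far_from_axes` and `is_sparse` are all hypothetical choices made for illustration. The half-plane in `near_axes` can be read as the margin constraint of a single training point x = (1, 2) with label +1 and zero bias.

```python
# Toy 2D illustration (assumed setup, not the paper's method):
# minimize |w1|^p + |w2|^p over a feasible region by brute-force grid search
# and check whether the minimizer has a zero coordinate (a dropped feature).

def lp_objective(w, p):
    """Lp regularization term sum_i |w_i|^p."""
    return sum(abs(x) ** p for x in w)

def minimize_lp(feasible, p, upper=2.0, steps=201):
    """Return the grid point in [0, upper]^2 that is feasible and minimizes the Lp term."""
    best_val, best_w = None, None
    for i in range(steps):
        for j in range(steps):
            w = (upper * i / (steps - 1), upper * j / (steps - 1))
            if not feasible(w):
                continue
            val = lp_objective(w, p)
            if best_val is None or val < best_val:
                best_val, best_w = val, w
    return best_w

def near_axes(w):
    # Hypothetical region touching both axes: w1 + 2*w2 >= 1 (small tolerance for rounding).
    return w[0] + 2.0 * w[1] >= 1.0 - 1e-9

def far_from_axes(w):
    # Hypothetical region bounded away from both axes: w1 >= 1 and w2 >= 1.
    return w[0] >= 1.0 and w[1] >= 1.0

def is_sparse(w, tol=1e-9):
    """True if at least one coordinate is (numerically) zero."""
    return any(abs(x) <= tol for x in w)

if __name__ == "__main__":
    for name, region in (("near_axes", near_axes), ("far_from_axes", far_from_axes)):
        for p in (0.5, 1.0, 2.0):
            w = minimize_lp(region, p)
            print(name, "p =", p, "minimizer =", w, "sparse =", is_sparse(w))
```

In this sketch the region that touches the axes yields a sparse minimizer for p = 0.5 and p = 1 but not for p = 2, while the region bounded away from the axes yields no sparse minimizer for any of the tested p. Both outcomes are consistent with the dependence on both p and the data described above.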