An Improved LAM Feature Selection Algorithm

Yong-gong Ren, Nan Lin, Yu-qi Sun
{"title":"一种改进的LAM特征选择算法","authors":"Yong-gong Ren, Nan Lin, Yu-qi Sun","doi":"10.1109/WISA.2010.33","DOIUrl":null,"url":null,"abstract":"In text categorization, feature selection is an effective feature dimension-reduction methods. To solve the problems of unadaptable high original feature space dimension, too much irrelevance, data redundancy and difficulties in selecting a threshold, we propose an improved LAM feature selection algorithm (ILAMFS). Firstly, combining the gold segmentation and the LAM algorithm based on the characteristics and the category of the correlation analysis, filtering the original feature set, and retaining the feature selection with strong correlation and weak category. Secondly, with the improved LAM algorithm, weighted average and Jaccard coefficient of such thoughts feature subsets make redundancy filtering out redundant features. Finally, we obtain an approximate optimal feature subset. Experimental results show that this method is effective in data dimension on reduction, threshold selection and furthermore, in reducing the computation amount and precision in the feature selection.","PeriodicalId":122827,"journal":{"name":"2010 Seventh Web Information Systems and Applications Conference","volume":"4 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2010-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"An Improved LAM Feature Selection Algorithm\",\"authors\":\"Yong-gong Ren, Nan Lin, Yu-qi Sun\",\"doi\":\"10.1109/WISA.2010.33\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In text categorization, feature selection is an effective feature dimension-reduction methods. To solve the problems of unadaptable high original feature space dimension, too much irrelevance, data redundancy and difficulties in selecting a threshold, we propose an improved LAM feature selection algorithm (ILAMFS). Firstly, combining the gold segmentation and the LAM algorithm based on the characteristics and the category of the correlation analysis, filtering the original feature set, and retaining the feature selection with strong correlation and weak category. Secondly, with the improved LAM algorithm, weighted average and Jaccard coefficient of such thoughts feature subsets make redundancy filtering out redundant features. Finally, we obtain an approximate optimal feature subset. 
Experimental results show that this method is effective in data dimension on reduction, threshold selection and furthermore, in reducing the computation amount and precision in the feature selection.\",\"PeriodicalId\":122827,\"journal\":{\"name\":\"2010 Seventh Web Information Systems and Applications Conference\",\"volume\":\"4 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2010-08-20\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2010 Seventh Web Information Systems and Applications Conference\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/WISA.2010.33\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2010 Seventh Web Information Systems and Applications Conference","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/WISA.2010.33","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 1

Abstract

In text categorization, feature selection is an effective dimensionality-reduction method. To address the problems of an excessively high-dimensional original feature space, large numbers of irrelevant features, data redundancy, and the difficulty of selecting a threshold, we propose an improved LAM feature selection algorithm (ILAMFS). First, the golden-section method is combined with the LAM algorithm, which analyzes the correlation between features and categories, to filter the original feature set, retaining features that are strongly correlated with the categories and discarding weakly correlated ones. Second, the improved LAM algorithm applies a weighted average together with the Jaccard coefficient to the candidate feature subset in order to filter out redundant features. Finally, an approximately optimal feature subset is obtained. Experimental results show that the method is effective for dimensionality reduction and threshold selection, and that it performs well in terms of both the computational cost and the precision of feature selection.
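
The abstract sketches a two-stage procedure: a relevance filter whose cut-off is tied to the golden section, followed by a redundancy filter based on the Jaccard coefficient. The snippet below is a minimal Python sketch of that general idea, not the authors' implementation: the paper's LAM relevance measure and weighted-average combination are not specified here, so the scoring function, all thresholds, and the helper names (relevance_scores, golden_section_filter, redundancy_filter, max_sim) are placeholders assumed purely for illustration.

```python
# Minimal sketch of a two-stage feature selection in the spirit of the abstract.
# NOT the paper's ILAMFS implementation: the LAM measure and weighted-average
# combination are replaced by simple stand-ins (see comments).

import numpy as np

GOLDEN_RATIO = (np.sqrt(5) - 1) / 2  # ~0.618, the golden-section proportion


def relevance_scores(X, y):
    """Placeholder feature-category relevance score: document frequency within
    the positive class. The paper's LAM measure would go in its place."""
    pos = X[y == 1]
    return pos.sum(axis=0) / max(len(pos), 1)


def golden_section_filter(scores):
    """Stage 1: keep features above a cut-off placed at the golden-section point
    of the score range instead of a hand-tuned threshold (assumed interpretation)."""
    lo, hi = scores.min(), scores.max()
    threshold = lo + (hi - lo) * (1 - GOLDEN_RATIO)
    return np.where(scores >= threshold)[0]


def jaccard(a, b):
    """Jaccard coefficient |A ∩ B| / |A ∪ B| of two binary occurrence vectors,
    i.e. over the sets of documents in which each feature appears."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union if union else 0.0


def redundancy_filter(X, candidates, scores, max_sim=0.8):
    """Stage 2: greedily keep candidates in decreasing relevance order, dropping
    any feature whose Jaccard similarity to an already-kept feature is too high."""
    kept = []
    for f in sorted(candidates, key=lambda i: scores[i], reverse=True):
        if all(jaccard(X[:, f] > 0, X[:, k] > 0) <= max_sim for k in kept):
            kept.append(f)
    return kept


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.integers(0, 2, size=(200, 50))   # toy binary document-term matrix
    y = rng.integers(0, 2, size=200)         # toy binary category labels
    scores = relevance_scores(X, y)
    candidates = golden_section_filter(scores)
    selected = redundancy_filter(X, candidates, scores)
    print(f"{len(selected)} of {X.shape[1]} features retained")
```

The greedy pass in stage 2 mirrors the stated goal of removing redundant features while keeping the most relevant representative of each group; how the paper actually weights and combines the relevance and redundancy criteria is not recoverable from the abstract alone.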