Title: An Improved LAM Feature Selection Algorithm (一种改进的LAM特征选择算法)
Authors: Yong-gong Ren, Nan Lin, Yu-qi Sun
DOI: 10.1109/WISA.2010.33
Published in: 2010 Seventh Web Information Systems and Applications Conference, 2010-08-20

Abstract: In text categorization, feature selection is an effective dimensionality-reduction method. To address the problems of a prohibitively high-dimensional original feature space, excessive irrelevant features, data redundancy, and the difficulty of choosing a threshold, we propose an improved LAM feature selection algorithm (ILAMFS). First, combining golden-section search with the LAM algorithm, it analyzes the correlation between features and categories, filters the original feature set, and retains the features that are strongly correlated with a category. Second, the improved LAM algorithm applies a weighted average and the Jaccard coefficient to the candidate feature subset to filter out redundant features. Finally, an approximately optimal feature subset is obtained. Experimental results show that this method is effective for dimensionality reduction and threshold selection, and furthermore that it reduces the computational cost of feature selection while improving precision.
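The second stage described in the abstract, filtering out redundant features with the Jaccard coefficient, can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the function names, the greedy keep-or-drop strategy, and the `max_overlap` threshold are assumptions, and the paper's weighted-average component is omitted.

```python
def jaccard(a, b):
    """Jaccard coefficient of two sets of document ids: |A ∩ B| / |A ∪ B|."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def redundancy_filter(candidates, doc_sets, max_overlap=0.9):
    """Greedily keep features whose document-occurrence sets do not
    overlap too strongly with an already-kept feature.

    `candidates` is assumed to be ordered by relevance (output of the
    first, category-correlation stage); `doc_sets` maps each feature
    to the set of documents it occurs in. Threshold is illustrative.
    """
    kept = []
    for f in candidates:
        if all(jaccard(doc_sets[f], doc_sets[g]) < max_overlap for g in kept):
            kept.append(f)
    return kept

# Toy example: feature "b" occurs in exactly the same documents as "a",
# so it carries no extra information and is dropped as redundant.
doc_sets = {"a": {1, 2, 3, 4}, "b": {1, 2, 3, 4}, "c": {5, 6}}
print(redundancy_filter(["a", "b", "c"], doc_sets))  # -> ['a', 'c']
```

Processing candidates in relevance order means that whenever two features are near-duplicates, the more category-correlated one survives, which matches the abstract's goal of an approximately optimal feature subset.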