Hybrid ACO and TOFA feature selection approach for text classification

H. Alghamdi, Lilian Tang, S. Alshomrani
Published in: 2012 IEEE Congress on Evolutionary Computation
Publication date: 2012-06-10
DOI: 10.1109/CEC.2012.6252960 (https://doi.org/10.1109/CEC.2012.6252960)
Citations: 11

Abstract

With the rapidly increasing availability of text data on the Internet, selecting an appropriate set of features for text classification becomes more important, not only for reducing the dimensionality of the feature space but also for improving classification performance. This paper proposes a novel feature selection approach that improves the performance of a text classifier by integrating the Ant Colony Optimization algorithm (ACO) with Trace Oriented Feature Analysis (TOFA). ACO is a metaheuristic search algorithm derived from the study of the foraging behavior of real ants, specifically the pheromone communication they use to find the shortest path to a food source. TOFA is a unified optimization framework developed to integrate several state-of-the-art dimension reduction algorithms. Previous research has shown that ACO is a promising approach for optimization and feature selection problems. TOFA is capable of dealing with large-scale text data and can be applied to several text analysis applications such as text classification, clustering and retrieval. To achieve good classification performance, the proposed approach uses TOFA and classifier performance as the heuristic information of ACO. Results on the public Reuters and Brown datasets demonstrate the effectiveness of the proposed approach.
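The abstract gives no formulas, so the following is only a minimal sketch of how an ant-colony search can pick a feature subset when guided by a per-feature heuristic score and a subset quality measure. It is not the authors' algorithm. It assumes the standard Ant System rule, in which a feature is chosen with probability proportional to pheromone^alpha times heuristic^beta, followed by evaporation and score-weighted pheromone deposit. The names `aco_select_features`, `heuristic`, `evaluate` and all parameters are hypothetical; the heuristic values stand in for a TOFA-derived score and the toy `evaluate` function stands in for classifier performance.

```python
import random


def aco_select_features(heuristic, evaluate, n_select, n_ants=10, n_iters=20,
                        alpha=1.0, beta=1.0, rho=0.2, seed=0):
    """Search for a subset of n_select feature indices with a basic ant colony.

    heuristic: positive per-feature scores (assumed stand-in for a TOFA score).
    evaluate:  maps a list of feature indices to a non-negative quality
               (assumed stand-in for classifier performance).
    """
    rng = random.Random(seed)
    n = len(heuristic)
    pheromone = [1.0] * n
    best_subset, best_score = None, float("-inf")

    for _ in range(n_iters):
        trails = []
        for _ in range(n_ants):
            remaining = list(range(n))
            subset = []
            # Each ant builds a subset one feature at a time.
            while len(subset) < n_select:
                weights = [(pheromone[i] ** alpha) * (heuristic[i] ** beta)
                           for i in remaining]
                pick = rng.choices(remaining, weights=weights, k=1)[0]
                subset.append(pick)
                remaining.remove(pick)
            score = evaluate(subset)
            trails.append((subset, score))
            if score > best_score:
                best_subset, best_score = sorted(subset), score

        # Evaporate old pheromone, then deposit in proportion to subset quality.
        pheromone = [(1.0 - rho) * t for t in pheromone]
        for subset, score in trails:
            for i in subset:
                pheromone[i] += score

    return best_subset, best_score


# Toy problem: features 0, 1 and 4 are the informative ones.
informative = {0, 1, 4}
toy_heuristic = [0.9, 0.8, 0.1, 0.1, 0.7, 0.1]


def toy_evaluate(subset):
    # Fraction of the informative features that the subset contains.
    return len(informative & set(subset)) / len(informative)


subset, score = aco_select_features(toy_heuristic, toy_evaluate, n_select=3)
print(subset, score)
```

In a real setting, `evaluate` would train and validate a text classifier on the selected features, which is far more expensive than the toy function used here, so the number of ants and iterations would need to be chosen with that cost in mind.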