Hybrid Grasshopper and Chameleon Swarm Optimization Algorithm for Text Feature Selection with Density Peaks Clustering

R. Purushothaman, S. Selvakumar, S. Rajagopalan
Int. J. Comput. Intell. Appl., published 2022-09-26. DOI: 10.1142/s1469026822500183 (https://doi.org/10.1142/s1469026822500183)

Abstract

Clustering has applications in machine learning, image segmentation, data mining and pattern recognition. Selecting an appropriate clustering method is important for feature selection. This paper therefore proposes a Text Feature Selection (FS) and Clustering method using Grasshopper–Chameleon Swarm Optimization with a Density Peaks Clustering algorithm (TFSC-G-CSOA-DPCA). First, the input features are pre-processed to convert the text into numerical form. The pre-processed text features are then passed to the Grasshopper–Chameleon Swarm Optimization Algorithm, which selects the important text features: the Grasshopper Optimization Algorithm selects local features from the text documents, and the Chameleon Swarm Optimization Algorithm selects the best global features from those local features. The selected features are then tested using the density peaks clustering algorithm to maximize reliability and minimize computational time. The performance of the Grasshopper–Chameleon Swarm Optimization Algorithm is analyzed on the 20 Newsgroups dataset, using accuracy, precision, sensitivity, specificity, execution time and memory usage as performance metrics. In simulation, the proposed TFSC-G-CSOA-DPCA method provides better accuracy of 97.36%, 95.14%, 94.67% and 91.91% and maximum sensitivity of 96.25%, 87.25%, 93.96% and 92.59% compared to the existing methods TFSC-BBA-MCL, TFSC-MVO-K-Means C, TFSC-GWO-GOA-FCM and TFSC-WM-K-Means C, respectively.
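The abstract does not give implementation details for the clustering stage, so the following is only a minimal sketch of the general density peaks idea: each point receives a local density and a distance to its nearest denser point, points where both are large become cluster centers, and every other point inherits the label of its nearest denser neighbor. The function name `density_peaks`, the cutoff `d_c`, the parameter `n_clusters` and the toy 2-D points are assumptions made for illustration. They are not taken from the paper, which applies the clustering to selected text features.

```python
import math

def density_peaks(points, d_c, n_clusters):
    """Illustrative density peaks clustering with a hard cutoff kernel.

    points: list of equal-length numeric tuples (hypothetical feature vectors)
    d_c: cutoff distance for the local density (assumed parameter)
    n_clusters: number of centers to pick (assumed parameter)
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Local density: how many other points lie closer than d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]

    # Visit points from densest to sparsest; ties are broken by index.
    order = sorted(range(n), key=lambda i: (-rho[i], i))

    delta = [0.0] * n   # distance to the nearest point of higher density
    parent = [-1] * n   # index of that nearest denser point
    for rank, i in enumerate(order):
        if rank == 0:
            # The densest point has no denser neighbor; use its largest distance.
            delta[i] = max(dist[i])
            continue
        j = min(order[:rank], key=lambda k: dist[i][k])
        delta[i] = dist[i][j]
        parent[i] = j

    # Centers are the points with the largest product rho * delta.
    centers = sorted(range(n), key=lambda i: (-rho[i] * delta[i], i))[:n_clusters]

    labels = [-1] * n
    for cluster_id, c in enumerate(centers):
        labels[c] = cluster_id
    # Each remaining point takes the label of its nearest denser neighbor,
    # which has already been labeled because of the visiting order.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, centers
```

Calling `density_peaks(points, d_c=1.2, n_clusters=2)` on two well-separated groups of 2-D points assigns one label per group. In a text setting, the points would instead be the numerical feature vectors that remain after feature selection, and `d_c` would need tuning to the data.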