A multi-stage multi-objective GWO based feature selection approach for multi-label text classification

Pradip Dhal, Chandrashekhar Azad
Published in: 2022 2nd International Conference on Intelligent Technologies (CONIT), 2022-06-24. DOI: 10.1109/CONIT55038.2022.9847886 (https://doi.org/10.1109/CONIT55038.2022.9847886)
Citations: 1

Abstract

Multi-label Text Classification (MTC), in which a document can belong to more than one category, plays an essential role in Information Retrieval (IR), Text Mining (TM), and web search. Text documents frequently contain High Dimensional (HD), non-discriminative (noisy and irrelevant) terms, which raise computing costs and degrade the learning performance of Text Classification (TC). Small sample sizes combined with HD data complicate Feature Selection (FS) in three ways. First, with limited samples and HD data, FS is unstable. Second, with HD data, FS takes longer. Third, a single FS approach may not provide sufficient Classification Accuracy (CA). In this paper, we develop a two-stage FS approach for MTC based on a Meta-heuristic Algorithm (MA). The first stage applies a filter-based FS approach, while the second stage is based on the multi-objective Grey Wolf Optimization (GWO) algorithm. The first objective is to minimize the Hamming Loss (HL), and the second is to reduce the number of Selected Features (SF). We use a Multi-Layer Perceptron (MLP) model for the classification task. The experimental findings show that the proposed FS scheme achieves a lower HL with fewer features.
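The two objectives and the wolf update named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the filter criterion, the binarization scheme, or any parameter values, so the function names, the sigmoid transfer function, and the averaging over three leaders (the standard GWO formulation) are all assumptions made here for illustration. The filter stage and MLP training are omitted; `hl` is assumed to come from evaluating some classifier on the features kept by `mask`.

```python
import math
import random
from typing import List, Sequence, Tuple

Mask = List[int]  # 1 = feature kept, 0 = feature dropped


def hamming_loss(y_true: Sequence[Sequence[int]], y_pred: Sequence[Sequence[int]]) -> float:
    """Fraction of (sample, label) slots that are predicted incorrectly."""
    total = 0
    wrong = 0
    for t_row, p_row in zip(y_true, y_pred):
        for t, p in zip(t_row, p_row):
            total += 1
            wrong += int(t != p)
    return wrong / total


def objectives(mask: Mask, hl: float) -> Tuple[float, float]:
    """Both objectives are minimized: Hamming loss and fraction of features kept."""
    return hl, sum(mask) / len(mask)


def dominates(a: Tuple[float, float], b: Tuple[float, float]) -> bool:
    """a Pareto-dominates b: no worse on every objective, strictly better on one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points: Sequence[Tuple[float, float]]) -> List[Tuple[float, float]]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points if q != p)]


def gwo_step(wolf: Sequence[float], leaders: Sequence[Sequence[float]],
             a: float, rng: random.Random) -> List[float]:
    """One standard (continuous) GWO update toward the alpha, beta, delta leaders."""
    new_pos = []
    for d, x in enumerate(wolf):
        pulls = []
        for leader in leaders:
            A = 2 * a * rng.random() - a      # exploration/exploitation coefficient
            C = 2 * rng.random()              # random weight on the leader
            dist = abs(C * leader[d] - x)
            pulls.append(leader[d] - A * dist)
        new_pos.append(sum(pulls) / len(pulls))
    return new_pos


def binarize(position: Sequence[float], rng: random.Random) -> Mask:
    """Assumed sigmoid transfer: larger coordinates are more likely to be kept."""
    mask = []
    for x in position:
        x = max(-50.0, min(50.0, x))          # clamp to keep exp() well behaved
        prob = 1 / (1 + math.exp(-x))
        mask.append(1 if rng.random() < prob else 0)
    return mask
```

In a full loop under these assumptions, each wolf's position would be binarized into a mask, the classifier evaluated on the masked features to obtain `hl`, and the non-dominated `(hl, feature fraction)` pairs kept as the candidate trade-offs from which leaders are drawn.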