DeepCKID: A Multi-Head Attention-Based Deep Neural Network Model Leveraging Classwise Knowledge to Handle Imbalanced Textual Data

Amit Kumar Sah, Muhammad Abulaish
{"title":"DeepCKID: A Multi-Head Attention-Based Deep Neural Network Model Leveraging Classwise Knowledge to Handle Imbalanced Textual Data","authors":"Amit Kumar Sah ,&nbsp;Muhammad Abulaish","doi":"10.1016/j.mlwa.2024.100575","DOIUrl":null,"url":null,"abstract":"<div><p>This paper presents DeepCKID, a Multi-Head Attention (MHA)-based deep learning model that exploits statistical and semantic knowledge corresponding to documents across different classes in the datasets to improve the model’s ability to detect minority class instances in imbalanced text classification. In this process, corresponding to each document, DeepCKID extracts — (i) word-level statistical and semantic knowledge, namely, class correlation and class similarity corresponding to each word, based on its association with different classes in the dataset, and (ii) class-level knowledge from the document using <span><math><mi>n</mi></math></span>-grams and relation triplets corresponding to classwise keywords present, identified using cosine similarity utilizing Transformers-based Pre-trained Language Models (PLMs). DeepCKID encodes the word-level and class-level features using deep convolutional networks, which can learn meaningful patterns from them. At first, DeepCKID combines the semantically meaningful Sentence-BERT document embeddings and word-level feature matrix to give the final document representation, which it further fuses to the different classwise encoded representations to strengthen feature propagation. DeepCKID then passes the encoded document representation and its different classwise representations through an MHA layer to identify the important features at different positions of the feature subspaces, resulting in a latent dense vector accentuating its association with a particular class. Finally, DeepCKID passes the latent vector to the softmax layer to learn the corresponding class label. We evaluate DeepCKID over six publicly available Amazon reviews datasets using four Transformers-based PLMs. We compare DeepCKID with three approaches and four ablation-like baselines. Our study suggests that in most cases, DeepCKID outperforms all the comparison approaches, including baselines.</p></div>","PeriodicalId":74093,"journal":{"name":"Machine learning with applications","volume":"17 ","pages":"Article 100575"},"PeriodicalIF":0.0000,"publicationDate":"2024-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2666827024000513/pdfft?md5=8efb9f85f258bdd00899e0b78ef5e189&pid=1-s2.0-S2666827024000513-main.pdf","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Machine learning with applications","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2666827024000513","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

This paper presents DeepCKID, a Multi-Head Attention (MHA)-based deep learning model that exploits statistical and semantic knowledge about documents across the different classes of a dataset to improve the detection of minority-class instances in imbalanced text classification. For each document, DeepCKID extracts (i) word-level statistical and semantic knowledge, namely the class correlation and class similarity of each word, based on its association with the different classes in the dataset, and (ii) class-level knowledge from the document, using n-grams and relation triplets corresponding to the classwise keywords it contains, identified via cosine similarity computed with Transformers-based Pre-trained Language Models (PLMs). DeepCKID encodes the word-level and class-level features using deep convolutional networks, which learn meaningful patterns from them. First, DeepCKID combines the semantically meaningful Sentence-BERT document embedding with the word-level feature matrix to obtain the final document representation, which it then fuses with the different classwise encoded representations to strengthen feature propagation. DeepCKID then passes the encoded document representation and its classwise representations through an MHA layer to identify important features at different positions of the feature subspaces, yielding a latent dense vector that accentuates the document's association with a particular class. Finally, DeepCKID passes this latent vector to a softmax layer to predict the class label. We evaluate DeepCKID on six publicly available Amazon reviews datasets using four Transformers-based PLMs, comparing it against three existing approaches and four ablation-like baselines. In most cases, DeepCKID outperforms all comparison approaches, including the baselines.
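The abstract describes the pipeline but not its implementation. The following is a minimal PyTorch sketch of the fusion-and-attention stage as the abstract describes it: a convolutional encoder over the word-level feature matrix, a Sentence-BERT document embedding fused with it, classwise knowledge representations added in, and an MHA layer followed by softmax. All class names, dimensions (e.g., `doc_dim=384` for a typical Sentence-BERT model), and wiring details are assumptions for illustration, not the authors' released code.

```python
# Illustrative sketch of a DeepCKID-style fusion-and-attention stage.
# Module names, dimensions, and wiring are assumptions based on the abstract.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DeepCKIDSketch(nn.Module):
    def __init__(self, num_classes, doc_dim=384, word_feat_dim=2,
                 hidden_dim=256, num_heads=4):
        super().__init__()
        # Convolutional encoder for the word-level feature matrix
        # (per-word class correlation and class similarity, one pair per class).
        self.word_encoder = nn.Sequential(
            nn.Conv1d(word_feat_dim * num_classes, hidden_dim,
                      kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveMaxPool1d(1),
        )
        # Projections for the Sentence-BERT document embedding and the
        # classwise knowledge embeddings (n-grams / relation triplets).
        self.doc_proj = nn.Linear(doc_dim, hidden_dim)
        self.class_proj = nn.Linear(doc_dim, hidden_dim)
        # Multi-head attention over [document; classwise] representations.
        self.mha = nn.MultiheadAttention(hidden_dim, num_heads, batch_first=True)
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, doc_emb, word_feats, class_embs):
        # doc_emb:    (B, doc_dim)                 document embedding
        # word_feats: (B, word_feat_dim * C, L)    word-level feature matrix
        # class_embs: (B, C, doc_dim)              classwise knowledge embeddings
        w = self.word_encoder(word_feats).squeeze(-1)        # (B, H)
        d = self.doc_proj(doc_emb) + w                       # fused document repr.
        c = self.class_proj(class_embs)                      # (B, C, H)
        # Fuse the document representation into each classwise representation,
        # then let MHA attend over the resulting sequence.
        seq = torch.cat([d.unsqueeze(1), c + d.unsqueeze(1)], dim=1)  # (B, 1+C, H)
        attn_out, _ = self.mha(seq, seq, seq)
        latent = attn_out[:, 0]                              # latent dense vector
        return F.log_softmax(self.classifier(latent), dim=-1)


# Smoke test with random stand-ins for the real embeddings and features.
B, C, L = 2, 4, 64
model = DeepCKIDSketch(num_classes=C)
out = model(torch.randn(B, 384), torch.randn(B, 2 * C, L), torch.randn(B, C, 384))
print(out.shape)  # torch.Size([2, 4])
```

The sketch reads the latent vector from the first sequence position, a common convention; the paper itself may pool or concatenate the attended representations differently, and `log_softmax` stands in for the softmax output layer the abstract mentions.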
