Shun Su, Dangguo Shao, Lei Ma, Sanli Yi, Ziwei Yang
DOI: 10.1016/j.aei.2025.103202
Journal: Advanced Engineering Informatics, Volume 65, Article 103202
Publication date: 2025-02-19
URL: https://www.sciencedirect.com/science/article/pii/S1474034625000953
ADCL: An attention feature enhancement network based on adversarial contrastive learning for short text classification
Supervised Contrastive Learning (SCL) has emerged as a powerful approach for improving model performance in text classification tasks, particularly in few-shot learning scenarios. However, existing SCL methods predominantly focus on the contrastive relationships between positive and negative samples, often neglecting the intrinsic semantic features of individual samples. This limitation can introduce training biases, especially when labeled data are scarce. Additionally, the intrinsic feature sparsity of short texts further aggravates this issue, hindering the extraction of discriminative and robust representations. To address these challenges, we propose a Label-aware Attention-based Adversarial Contrastive Learning Network (ADCL). The model incorporates a bidirectional contrastive learning framework that leverages cross-attention layers to enhance interactions between label and document representations. Moreover, adversarial learning is employed to optimize the backpropagation of contrastive learning gradients, effectively decoupling sample embeddings from label-specific features. Compared to prior methods, ADCL not only emphasizes contrasts between positive and negative samples but also prioritizes the intrinsic semantic information of individual samples during the learning process. We conduct comprehensive experiments from both full-shot and few-shot learning perspectives on five benchmark short-text datasets: SST-2, SUBJ, TREC, PC, and CR. The results demonstrate that ADCL consistently outperforms existing contrastive learning methods, achieving superior average accuracy across the majority of tasks.
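The supervised contrastive objective the abstract builds on can be illustrated with a minimal NumPy sketch of the generic SupCon loss: for each anchor, same-label samples in the batch are positives and all others are negatives. This is the standard formulation, not the paper's ADCL variant; the temperature value is illustrative.

```python
import numpy as np

def sup_con_loss(embeddings, labels, tau=0.1):
    """Generic supervised contrastive loss (batch-level sketch)."""
    # L2-normalize so dot products are cosine similarities
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = z @ z.T / tau  # temperature-scaled pairwise similarities
    n = len(labels)
    total = 0.0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue
        others = [j for j in range(n) if j != i]
        # log of the denominator: sum over all non-anchor samples
        log_denom = np.log(np.sum(np.exp(sim[i, others])))
        # average negative log-probability over the positive set
        total += -np.mean([sim[i, j] - log_denom for j in positives])
    return total / n
```

When embeddings of same-label samples cluster together, the loss is small; assigning labels that cut across the geometry drives it up, which is the signal the contrastive objective optimizes.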
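The label-document interaction via cross-attention layers can be sketched generically as scaled dot-product attention with label embeddings as queries and document token embeddings as keys and values. Shapes, names, and the single-head form are assumptions for illustration, not the paper's exact architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys, values):
    """Single-head scaled dot-product cross-attention.

    queries: (num_labels, d)  label embeddings
    keys, values: (seq_len, d)  document token embeddings
    Returns label-conditioned document summaries and the attention map.
    """
    d = queries.shape[-1]
    attn = softmax(queries @ keys.T / np.sqrt(d), axis=-1)
    return attn @ values, attn
```

Each output row is a document representation weighted by its relevance to one label, which is one common way to realize the label-document interaction the abstract describes.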
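Adversarial learning over text embeddings is commonly implemented as an FGM-style perturbation: the embedding is shifted a fixed distance along the normalized loss gradient before the contrastive loss is recomputed. The sketch below shows that generic perturbation step only; the epsilon value is illustrative and the paper's exact adversarial scheme may differ.

```python
import numpy as np

def fgm_perturb(embedding, grad, eps=0.5):
    """FGM-style adversarial perturbation of an embedding.

    Moves the embedding distance `eps` along the direction of the
    loss gradient, producing a worst-case neighbour for training.
    """
    norm = np.linalg.norm(grad)
    if norm == 0:
        return embedding  # zero gradient: nothing to perturb
    return embedding + eps * grad / norm
```

Training on such perturbed neighbours encourages representations that stay stable under small semantic shifts, which is the robustness motivation behind adversarial contrastive learning.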
Journal introduction:
Advanced Engineering Informatics is an international Journal that solicits research papers with an emphasis on 'knowledge' and 'engineering applications'. The Journal seeks original papers that report progress in applying methods of engineering informatics. These papers should have engineering relevance and help provide a scientific base for more reliable, spontaneous, and creative engineering decision-making. Additionally, papers should demonstrate the science of supporting knowledge-intensive engineering tasks and validate the generality, power, and scalability of new methods through rigorous evaluation, preferably both qualitatively and quantitatively. Abstracting and indexing for Advanced Engineering Informatics include Science Citation Index Expanded, Scopus and INSPEC.