Tactics and Techniques Text Classification Based on Adversarial Contrastive Learning and Meta-Path
Authors: Yuchun Han; Weiping Wang; Zhe Qu; Shigeng Zhang
Journal: IEEE Transactions on Information Forensics and Security, vol. 20, pp. 10098-10113
Published: 2025-09-11
DOI: 10.1109/TIFS.2025.3609218
URL: https://ieeexplore.ieee.org/document/11159532/
Impact factor: 8.0; JCR: Q1 (Computer Science, Theory & Methods)
Citation count: 0
Abstract
Tactics and techniques information in Cyber Threat Intelligence (CTI) represents the objectives of attackers and the means through which these objectives are achieved. The classification of tactics and techniques descriptions in CTI has been extensively studied to assist security experts in interpreting attack patterns. Although many recent studies have applied various deep learning methods to enhance classification performance, they mainly focus on improving performance from an average or top perspective. However, the imbalance between tactical and technical tag samples, as well as text sparsity, may lead to poor model performance, an issue that has been under-explored. To address these issues, we propose a new tactics and techniques classification model based on adversarial contrastive learning and meta-path (TTC-ACLM). In TTC-ACLM, a novel text representation learning module is first designed. It combines a pre-trained language model (PLM) with contrastive adversarial methods, which adapt better to categories with smaller sample sizes while obtaining better text representations. Then, heterogeneous information networks are used to model the rich relationships between texts and labels (tactics and techniques); these networks can merge additional information, e.g., processes and tools, to address text sparsity. Next, we define a meta-path-based classifier learning module that maps text, tactics, and meta-path-based context to a set of classifiers, which are applied to the text representation generated by the text representation module for better classification. Finally, the classification performance is further improved through the tactics and techniques correlation enhancement matrix. Through in-depth research, we demonstrate that the proposed model can effectively address the impact of sample imbalance and text sparsity. Extensive experimental results indicate that TTC-ACLM achieves state-of-the-art performance.
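To make the idea of meta-path-based context concrete, the following is a minimal sketch of how a heterogeneous information network linking texts, tools, techniques, and tactics could be traversed along a meta-path. It is not the authors' implementation: the abstract does not list the node types, relation names, or meta-paths used in TTC-ACLM, so every node, relation (`mentions`, `labeled`, `belongs_to`), and helper function below is a hypothetical illustration.

```python
from collections import defaultdict

# Toy heterogeneous information network (HIN). All node names and relation
# names are hypothetical illustrations, not data or schema from the paper.
EDGES = [
    ("text:t1", "mentions", "tool:mimikatz"),
    ("text:t2", "mentions", "tool:mimikatz"),
    ("text:t2", "labeled", "technique:T1003"),
    ("technique:T1003", "belongs_to", "tactic:credential-access"),
    ("text:t3", "mentions", "tool:psexec"),
]


def build_adjacency(edges):
    """Index edges by (node, relation) and also store each reverse relation."""
    adj = defaultdict(set)
    for src, rel, dst in edges:
        adj[(src, rel)].add(dst)
        adj[(dst, "~" + rel)].add(src)  # "~" marks the reverse direction
    return adj


def follow_meta_path(adj, start, meta_path):
    """Return the nodes reached from `start` by following the relations in order."""
    frontier = {start}
    for rel in meta_path:
        frontier = {nxt for node in frontier for nxt in adj.get((node, rel), ())}
    return frontier


adj = build_adjacency(EDGES)

# Meta-path Text -> Tool -> Text -> Technique: an unlabeled, sparse text (t1)
# borrows label context from another text that mentions the same tool.
techniques = follow_meta_path(adj, "text:t1", ["mentions", "~mentions", "labeled"])
print(sorted(techniques))

# Extending the path by one hop reaches the tactic of that technique.
tactics = follow_meta_path(
    adj, "text:t1", ["mentions", "~mentions", "labeled", "belongs_to"]
)
print(sorted(tactics))
```

In this toy graph, `text:t1` carries no label of its own but shares a tool with the labeled `text:t2`, so the traversal surfaces a technique and tactic as context. This illustrates, under the stated assumptions, why merging auxiliary nodes such as tools can compensate for sparse text; how the paper turns such context into classifiers is not specified in the abstract.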
About the journal:
The IEEE Transactions on Information Forensics and Security covers the sciences, technologies, and applications relating to information forensics, information security, biometrics, surveillance, and systems applications that incorporate these features.