Tactics and Techniques Text Classification Based on Adversarial Contrastive Learning and Meta-Path

IF 8 · Zone 1, Computer Science · Q1 COMPUTER SCIENCE, THEORY & METHODS

IEEE Transactions on Information Forensics and Security Pub Date : 2025-09-11 DOI:10.1109/TIFS.2025.3609218

Yuchun Han;Weiping Wang;Zhe Qu;Shigeng Zhang

Citations: 0

Abstract

Tactics and techniques information in Cyber Threat Intelligence (CTI) represents the objectives of attackers and the means through which these objectives are achieved. The classification of tactics and techniques descriptions in CTI has been extensively studied to assist security experts in interpreting attack patterns. Although many recent studies have applied various deep learning methods to enhance classification performance, they mainly focus on improving performance from an average or top perspective. However, the imbalance between tactical and technical tag samples, as well as text sparsity, may lead to poor model performance, and this problem has been under-explored. To address these issues, we propose a new tactics and techniques classification model based on adversarial contrastive learning and meta-path (TTC-ACLM). In TTC-ACLM, a novel text representation learning module is first designed. It combines a pre-trained language model (PLM) with contrastive adversarial methods, which can better adapt to categories with smaller sample sizes while obtaining better text representations. Then, heterogeneous information networks are used to model the rich relationships between texts and labels (tactics and techniques), which can merge additional information, e.g., processes and tools, to address text sparsity. Next, we define a meta-path-based classifier learning module that maps text, tactics, and meta-path-based context to a set of classifiers, which are applied to the text representation generated by the text representation module for better classification. Finally, the classification performance is further improved through the tactics and techniques correlation enhancement matrix. Through in-depth research, we demonstrate that the proposed model can effectively address the impact of sample imbalance and text sparsity. Extensive experimental results indicate that TTC-ACLM achieves state-of-the-art performance.
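As a rough illustration of the meta-path idea mentioned in the abstract, the sketch below counts path instances along a meta-path (a sequence of node types) in a toy heterogeneous information network. It is not the authors' implementation: the node types, the node names (`t1`, `mimikatz`, `credential_dumping`, and so on), the edges, and the `metapath_counts` helper are all invented for this example, and the abstract does not say how TTC-ACLM builds or uses its meta-path context.

```python
# Toy heterogeneous information network (HIN). Every node name and edge here
# is an invented example, not data from the paper.
# Edges are stored per typed relation as (source, target) pairs.
EDGES = {
    ("text", "tool"): [("t1", "mimikatz"), ("t2", "mimikatz"), ("t2", "psexec")],
    ("tool", "technique"): [
        ("mimikatz", "credential_dumping"),
        ("psexec", "remote_services"),
    ],
}


def metapath_counts(edges, metapath):
    """Count path instances along a meta-path (a sequence of node types).

    Returns a dict {(start_node, end_node): number_of_paths}.
    """
    # Begin with one trivial path at every source node of the first relation.
    first_relation = edges[(metapath[0], metapath[1])]
    counts = {(src, src): 1 for src, _ in first_relation}

    # Extend every partial path by one typed relation at a time.
    for src_type, dst_type in zip(metapath, metapath[1:]):
        extended = {}
        for (start, current), n in counts.items():
            for a, b in edges[(src_type, dst_type)]:
                if a == current:
                    extended[(start, b)] = extended.get((start, b), 0) + n
        counts = extended
    return counts


# Which techniques is each text connected to through a shared tool?
counts = metapath_counts(EDGES, ["text", "tool", "technique"])
for (text, technique), n in sorted(counts.items()):
    print(text, "->", technique, ":", n)
```

In this toy graph a sparse text such as `t1` still reaches a technique label through the tool it mentions, which is the kind of extra context the abstract says heterogeneous information networks can supply.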

Source journal

IEEE Transactions on Information Forensics and Security (Engineering & Technology - Engineering: Electrical & Electronic)

CiteScore

14.40

Self-citation rate

7.40%

Publication volume

234

Review time

6.5 months

About the journal: The IEEE Transactions on Information Forensics and Security covers the sciences, technologies, and applications relating to information forensics, information security, biometrics, surveillance, and systems applications that incorporate these features.