Optimising AI models for intelligence extraction in the life cycle of Cybersecurity Threat Landscape generation

Impact factor 3.8 · CAS Zone 2 (Computer Science) · JCR Q2 (Computer Science, Information Systems)
Alexandros Zacharis, Razvan Gavrila, Constantinos Patsakis, Christos Douligeris
Journal of Information Security and Applications, vol. 90, Article 104037. Published 2025-04-01. Not open access.
DOI: 10.1016/j.jisa.2025.104037
https://www.sciencedirect.com/science/article/pii/S2214212625000754
Citations: 0

Abstract

The increasing complexity and frequency of cyber attacks in the modern digital environment demand continuous vigilance and proactive strategies to manage risks effectively. Conventional approaches to generating intelligence for Cybersecurity Threat Landscape (CTL) reports are often resource-intensive and time-consuming, as they depend on manual identification, collection, and analysis of relevant electronically stored information (ESI). This study investigates the potential of artificial intelligence (AI) to transform CTL generation, reducing manual classification and tagging while improving efficiency and accuracy.
We evaluate the classification performance of several Large Language Models (LLMs), including Gemini 1.5 Pro and GPT-4o, as well as Bidirectional Encoder Representations from Transformers (BERT)-based models such as TRAM and TTPHunter and custom Named Entity Recognition (NER) models, using a dataset previously annotated by human experts. Our findings show promising results for AI-driven intelligence extraction in CTL report generation, which can streamline cybersecurity operations by automating routine tasks and providing precise and timely threat intelligence. However, the variability in model performance suggests that hybrid approaches are needed to reach the accuracy of human annotation. Therefore, we propose a novel voting agreement-based methodology that makes the most of the combined capabilities of the AI models to address the complexities of cybersecurity threat intelligence extraction.
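
The abstract does not spell out how the voting agreement works, so the following is only a minimal sketch of one plausible reading: each model assigns a label to the same text span, and a label is accepted only when enough models agree on it. The function name `vote`, the `min_agreement` threshold, the tie-handling rule, the model names, and the example labels (MITRE ATT&CK technique IDs) are all illustrative assumptions, not details taken from the paper.

```python
from collections import Counter
from typing import Optional


def vote(labels: list[str], min_agreement: int = 2) -> Optional[str]:
    """Return the label that at least `min_agreement` models agree on.

    Returns None when there are no labels, when the top two labels are tied,
    or when the most common label falls short of the threshold. A None result
    could be routed to a human annotator (an assumption, not the paper's rule).
    """
    if not labels:
        return None
    ranked = Counter(labels).most_common()
    top_label, top_count = ranked[0]
    # A tie for first place means there is no clear winner.
    if len(ranked) > 1 and ranked[1][1] == top_count:
        return None
    return top_label if top_count >= min_agreement else None


# Hypothetical per-model outputs for a single sentence of a threat report.
predictions = {
    "model_a": "T1566",
    "model_b": "T1566",
    "model_c": "T1059",
}
print(vote(list(predictions.values())))  # T1566
```

Returning `None` instead of forcing a decision keeps disagreements visible, which fits the abstract's point that no single model matches human annotation on its own.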
Source journal
Journal of Information Security and Applications (Computer Science: Computer Networks and Communications)
CiteScore: 10.90
Self-citation rate: 5.40%
Articles published: 206
Review time: 56 days
About the journal: Journal of Information Security and Applications (JISA) focuses on original research and practice-driven applications with relevance to information security and applications. JISA provides a common linkage between a vibrant scientific and research community and industry professionals by offering a clear view on modern problems and challenges in information security, as well as identifying promising scientific and "best-practice" solutions. JISA issues offer a balance between original research work and innovative industrial approaches by internationally renowned information security experts and researchers.