Design and evaluation of an Autonomous Cyber Defence agent using DRL and an augmented LLM

IF 4.4 · Zone 2, Computer Science · JCR Q1, COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

Computer Networks · Pub Date: 2025-03-05 · DOI: 10.1016/j.comnet.2025.111162

Johannes Loevenich, Erik Adler, Tobias Hürten, Roberto Rigolin F. Lopes

Computer Networks, volume 262, Article 111162. Full text: https://www.sciencedirect.com/science/article/pii/S1389128625001306

Citations: 0


Design and evaluation of an Autonomous Cyber Defence agent using DRL and an augmented LLM

In this paper, we design and evaluate an Autonomous Cyber Defence (ACD) agent to monitor and act within critical network segments connected to untrusted infrastructure hosting active adversaries. We assume that modern network segments use software-defined controllers with the means to host ACD agents and other cybersecurity tools that implement hybrid AI models. Our agent uses a hybrid AI architecture that integrates deep reinforcement learning (DRL), augmented Large Language Models (LLMs), and rule-based systems. This architecture can be implemented in software-defined network controllers, enabling automated defensive actions such as monitoring, analysis, decoy deployment, service removal, and recovery. A core contribution of our work is the construction of three cybersecurity knowledge graphs that organise and map data from network logs, open source Cyber Threat Intelligence (CTI) reports, and vulnerability frameworks. These graphs enable automatic mapping of Common Vulnerabilities and Exposures (CVEs) to offensive tactics and techniques defined in the MITRE ATT&CK framework using Bidirectional Encoder Representations from Transformers (BERT) and Generative Pre-trained Transformer (GPT) models. Our experimental evaluation of the knowledge graphs shows that BERT-based models perform better, with precision (83.02%), recall (75.92%), and macro F1 scores (58.70%) significantly outperforming GPT models. The ACD agent was evaluated in a Cyber Operations Research (ACO) gym against eleven DRL models, including Proximal Policy Optimisation (PPO), Hierarchical PPO, and ensembles under two different attacker strategies. The results show that our ACD agent outperformed baseline implementations, with its DRL models effectively mitigating attacks and recovering compromised systems. In addition, we implemented and evaluated a chatbot using Retrieval-Augmented Generation (RAG) and a prompting agent augmented with the CTI reports represented in the cybersecurity knowledge graphs. 
The chatbot achieved high scores on generation metrics such as relevance (0.85), faithfulness (0.83), and semantic similarity (0.88), as well as retrieval metrics such as contextual precision (0.91). The experimental results suggest that the integration of hybrid AI systems with knowledge graphs can enable the automation and improve the precision of cyber defence operations, and also provide a robust interface for cybersecurity experts to interpret and respond to advanced cybersecurity threats.
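The abstract reports a macro F1 score (58.70%) well below the reported precision (83.02%) and recall (75.92%). One way such a gap can arise is that macro averaging gives every class equal weight, so a rarely seen technique that is missed entirely pulls the average down. The sketch below illustrates this on invented toy data. It assumes a single-label simplification of the CVE-to-ATT&CK mapping task; the abstract does not state how precision and recall were averaged, and the labels and predictions here are hypothetical, not the authors' data or method.

```python
from collections import defaultdict


def per_class_prf(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for single-label predictions."""
    tp, fp, fn = defaultdict(int), defaultdict(int), defaultdict(int)
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    scores = {}
    for c in sorted(set(y_true) | set(y_pred)):
        prec = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        scores[c] = (prec, rec, f1)
    return scores


def macro_f1(scores):
    """Unweighted mean of the per-class F1 scores."""
    return sum(f1 for _, _, f1 in scores.values()) / len(scores)


# Hypothetical toy data: ten CVEs, each labelled with one ATT&CK technique ID.
y_true = ["T1190"] * 8 + ["T1059"] + ["T1068"]
y_pred = ["T1190"] * 8 + ["T1190"] + ["T1068"]  # the single T1059 case is missed

scores = per_class_prf(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

print(f"accuracy = {accuracy:.3f}")          # 0.900
print(f"macro F1 = {macro_f1(scores):.3f}")  # 0.647
```

Nine of ten predictions are correct here, yet the macro F1 is only about 0.65, because the missed rare class contributes an F1 of zero with the same weight as the dominant class.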


Source journal

Computer Networks (Engineering & Technology: Telecommunications)

CiteScore: 10.80

Self-citation rate: 3.60%

Articles published: 434

Review time: 8.6 months

Journal description: Computer Networks is an international, archival journal providing a publication vehicle for complete coverage of all topics of interest to those involved in the computer communications networking area. The audience includes researchers, managers and operators of networks as well as designers and implementors. The Editorial Board will consider any material for publication that is of interest to those groups.