一个正式的框架,llm辅助从二进制工件自动生成Zeek签名

IF 6.2 2区 计算机科学 Q1 COMPUTER SCIENCE, THEORY & METHODS
Claudia Greco , Michele Ianni
{"title":"一个正式的框架,llm辅助从二进制工件自动生成Zeek签名","authors":"Claudia Greco ,&nbsp;Michele Ianni","doi":"10.1016/j.future.2025.108086","DOIUrl":null,"url":null,"abstract":"<div><div>Designing semantically meaningful and operationally effective intrusion detection signatures remains a labor-intensive and expertise-driven task, particularly within the Zeek network monitoring framework. In this paper, we introduce a formalized and modular system for automating Zeek signature generation using Large Language Models (LLMs). Our pipeline begins with static analysis of binary artifacts, extracts salient behavioral features, and transforms them into structured prompts for an LLM tasked with synthesizing Zeek scripts. We provide a rigorous formal framework that defines each stage of this transformation, along with theoretical models for prompt distortion, injection resilience, and sanitization. Furthermore, we explore the adversarial surface exposed by LLMs—introducing a taxonomy of injection attacks, prompt inversion risks, and behavioral feedback loops—and propose mitigations grounded in filtering and robust prompt engineering. Our approach not only accelerates signature creation but also enhances interpretability and adaptability in evolving threat environments. The framework lays the groundwork for future extensions involving dynamic analysis and automated post-validation of generated signatures.</div></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":"175 ","pages":"Article 108086"},"PeriodicalIF":6.2000,"publicationDate":"2025-08-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"A formal framework for LLM-assisted automated generation of Zeek signatures from binary artifacts\",\"authors\":\"Claudia Greco ,&nbsp;Michele Ianni\",\"doi\":\"10.1016/j.future.2025.108086\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Designing semantically meaningful and operationally effective intrusion detection signatures remains a labor-intensive and expertise-driven task, particularly within the Zeek network monitoring framework. In this paper, we introduce a formalized and modular system for automating Zeek signature generation using Large Language Models (LLMs). Our pipeline begins with static analysis of binary artifacts, extracts salient behavioral features, and transforms them into structured prompts for an LLM tasked with synthesizing Zeek scripts. We provide a rigorous formal framework that defines each stage of this transformation, along with theoretical models for prompt distortion, injection resilience, and sanitization. Furthermore, we explore the adversarial surface exposed by LLMs—introducing a taxonomy of injection attacks, prompt inversion risks, and behavioral feedback loops—and propose mitigations grounded in filtering and robust prompt engineering. Our approach not only accelerates signature creation but also enhances interpretability and adaptability in evolving threat environments. The framework lays the groundwork for future extensions involving dynamic analysis and automated post-validation of generated signatures.</div></div>\",\"PeriodicalId\":55132,\"journal\":{\"name\":\"Future Generation Computer Systems-The International Journal of Escience\",\"volume\":\"175 \",\"pages\":\"Article 108086\"},\"PeriodicalIF\":6.2000,\"publicationDate\":\"2025-08-16\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Future Generation Computer Systems-The International Journal of Escience\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0167739X25003802\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, THEORY & METHODS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Future Generation Computer Systems-The International Journal of Escience","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0167739X25003802","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, THEORY & METHODS","Score":null,"Total":0}
引用次数: 0

摘要

设计语义上有意义和操作上有效的入侵检测签名仍然是一项劳动密集型和专业知识驱动的任务,特别是在Zeek网络监控框架中。在本文中,我们介绍了一个形式化和模块化的系统,用于使用大型语言模型(llm)自动生成Zeek签名。我们的管道从二进制工件的静态分析开始,提取显著的行为特征,并将它们转换为结构化的提示,供负责合成Zeek脚本的LLM使用。我们提供了一个严格的正式框架,定义了这种转变的每个阶段,以及提示变形、注射弹性和消毒的理论模型。此外,我们探索了llms暴露的对抗表面,介绍了注入攻击的分类、提示反转风险和行为反馈回路,并提出了基于过滤和鲁棒提示工程的缓解措施。我们的方法不仅加快了签名的生成速度,而且提高了签名在不断变化的威胁环境中的可解释性和适应性。该框架为涉及动态分析和生成签名的自动后验证的未来扩展奠定了基础。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
A formal framework for LLM-assisted automated generation of Zeek signatures from binary artifacts
Designing semantically meaningful and operationally effective intrusion detection signatures remains a labor-intensive and expertise-driven task, particularly within the Zeek network monitoring framework. In this paper, we introduce a formalized and modular system for automating Zeek signature generation using Large Language Models (LLMs). Our pipeline begins with static analysis of binary artifacts, extracts salient behavioral features, and transforms them into structured prompts for an LLM tasked with synthesizing Zeek scripts. We provide a rigorous formal framework that defines each stage of this transformation, along with theoretical models for prompt distortion, injection resilience, and sanitization. Furthermore, we explore the adversarial surface exposed by LLMs—introducing a taxonomy of injection attacks, prompt inversion risks, and behavioral feedback loops—and propose mitigations grounded in filtering and robust prompt engineering. Our approach not only accelerates signature creation but also enhances interpretability and adaptability in evolving threat environments. The framework lays the groundwork for future extensions involving dynamic analysis and automated post-validation of generated signatures.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
CiteScore
19.90
自引率
2.70%
发文量
376
审稿时长
10.6 months
期刊介绍: Computing infrastructures and systems are constantly evolving, resulting in increasingly complex and collaborative scientific applications. To cope with these advancements, there is a growing need for collaborative tools that can effectively map, control, and execute these applications. Furthermore, with the explosion of Big Data, there is a requirement for innovative methods and infrastructures to collect, analyze, and derive meaningful insights from the vast amount of data generated. This necessitates the integration of computational and storage capabilities, databases, sensors, and human collaboration. Future Generation Computer Systems aims to pioneer advancements in distributed systems, collaborative environments, high-performance computing, and Big Data analytics. It strives to stay at the forefront of developments in grids, clouds, and the Internet of Things (IoT) to effectively address the challenges posed by these wide-area, fully distributed sensing and computing systems.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信