GPT on the wire: Towards realistic network traffic conversations generated with large language models

IF 4.4 · Zone 2, Computer Science · JCR Q1: COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

Computer Networks Pub Date : 2025-04-30 DOI:10.1016/j.comnet.2025.111308

Javier Aday Delgado-Soto, Jorge E. López de Vergara, Iván González, Daniel Perdices, Luis de Pedro

Computer Networks, volume 265, Article 111308. Full text: https://www.sciencedirect.com/science/article/pii/S1389128625002762

Citations: 0

Abstract

Realistic network traffic generation is essential for evaluating the performance, security, and scalability of modern communication systems. Traditional methods, such as traffic replay systems and statistical models, are useful but often fall short in capturing the complexity and variability of real-world network scenarios. Recent advancements in Artificial Intelligence (AI), especially Large Language Models (LLMs) like ChatGPT, have introduced new approaches to synthetic traffic generation. This paper presents a novel architecture using OpenAI’s GPT-3.5 Turbo to generate synthetic network traffic, with a focus on creating multi-protocol conversations that are indistinguishable from real-world interactions. Through fine-tuning and prompt engineering, the proposed system successfully generates packet- and conversation-level network traffic for the ICMP, ARP, DNS, TCP and HTTP protocols. Additionally, by integrating a Mixture of Experts (MoE) architecture, the model simulates real-world network conversations with high accuracy and can generate a conversation combining ARP, DNS, TCP and HTTP without packet or protocol errors. The results show how the application of LLMs to network traffic generation improves realism and adaptability, establishing this approach as a valuable tool for future security testing and network performance evaluation. In addition, the proposed methodology is easily adaptable to other LLMs, whether they are accessed through APIs or downloaded and run on a local computer.
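The abstract does not describe how the Mixture of Experts step selects an expert, so the following is only a minimal sketch of the general idea, assuming one expert per protocol and a hard lookup by protocol name. Every name in it (`EXPERTS`, `generate_conversation`, the context keys, the addresses) is hypothetical, and the experts are plain Python functions standing in for fine-tuned LLMs so that the example runs offline.

```python
from typing import Callable, Dict, List

# Hypothetical stand-ins for per-protocol experts. In the paper each expert
# would be a fine-tuned LLM; here each is a plain function returning
# illustrative packet summaries, not real model output.
Expert = Callable[[Dict[str, str]], List[str]]


def arp_expert(ctx: Dict[str, str]) -> List[str]:
    return [
        f"ARP who-has {ctx['gateway']} tell {ctx['client']}",
        f"ARP reply {ctx['gateway']} is-at {ctx['gateway_mac']}",
    ]


def dns_expert(ctx: Dict[str, str]) -> List[str]:
    return [
        f"DNS query A {ctx['host']}",
        f"DNS response A {ctx['host']} -> {ctx['server']}",
    ]


def tcp_expert(ctx: Dict[str, str]) -> List[str]:
    # Three-way handshake between client and server.
    return ["TCP SYN", "TCP SYN-ACK", "TCP ACK"]


def http_expert(ctx: Dict[str, str]) -> List[str]:
    return [f"HTTP GET / Host: {ctx['host']}", "HTTP 200 OK"]


# Assumed gating rule: a hard lookup by protocol name.
EXPERTS: Dict[str, Expert] = {
    "ARP": arp_expert,
    "DNS": dns_expert,
    "TCP": tcp_expert,
    "HTTP": http_expert,
}


def generate_conversation(plan: List[str], ctx: Dict[str, str]) -> List[str]:
    """Route each step of the plan to its protocol expert and concatenate."""
    packets: List[str] = []
    for protocol in plan:
        expert = EXPERTS.get(protocol)
        if expert is None:
            raise ValueError(f"no expert registered for protocol {protocol!r}")
        packets.extend(expert(ctx))
    return packets


# Illustrative context; addresses come from documentation-reserved ranges.
CONTEXT = {
    "client": "192.0.2.10",
    "gateway": "192.0.2.1",
    "gateway_mac": "00:00:5e:00:53:01",
    "host": "example.com",
    "server": "198.51.100.7",
}

conversation = generate_conversation(["ARP", "DNS", "TCP", "HTTP"], CONTEXT)
for line in conversation:
    print(line)
```

The plan `["ARP", "DNS", "TCP", "HTTP"]` mirrors the combined conversation named in the abstract: address resolution, name resolution, connection setup, then the request itself. Swapping a stub for a call to a fine-tuned model would leave the routing function unchanged.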

Source journal

Computer Networks (Engineering & Technology: Telecommunications)

CiteScore: 10.80
Self-citation rate: 3.60%
Articles published: 434
Review time: 8.6 months

About the journal: Computer Networks is an international, archival journal providing a publication vehicle for complete coverage of all topics of interest to those involved in the computer communications networking area. The audience includes researchers, managers and operators of networks as well as designers and implementors. The Editorial Board will consider any material for publication that is of interest to those groups.