ResDNViT: A hybrid architecture for Netflow-based attack detection using a residual dense network and Vision Transformer

IF 7.5 | CAS Zone 1 (Computer Science) | JCR Q1: COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Expert Systems with Applications | Pub Date: 2025-04-16 | DOI: 10.1016/j.eswa.2025.127504

Hassan Wasswa, Hussein A. Abbass, Timothy Lynar

Volume 282, Article 127504. Full text: https://www.sciencedirect.com/science/article/pii/S0957417425011261

Citations: 0

Abstract

The rapid evolution of technologies such as wireless sensor networks, cloud computing services, advanced AI-driven applications, and the Internet of Things (IoT) has increased reliance on the internet among individual users and enterprises, both small and large. Advances in cybersecurity have not matched this pace, and the past decade has consequently seen an exponentially rising trend in cyberattacks. To enhance network security, this work proposes ResDNViT, a robust model that integrates a self-attention-based Vision Transformer (ViT) architecture with a simplified ResNet-based architecture for NetFlow-based attack detection. Motivated by the strong performance of transformers in NLP and computer vision tasks, ResDNViT extends the ViT-based architecture to network traffic analysis by expressing NetFlow features as 2D matrices and splitting them into equal-sized sub-matrices, which are used as input patches for the encoder component. A simplified residual dense network (ResDN) with two residual dense blocks (RDBs) is stacked on the encoder’s output layer for classification. The novelty of this approach lies in adapting the ViT-based architecture, originally designed for images, to the analysis of NetFlow packets for attack classification. The model was evaluated on four well-studied benchmark datasets (CICIDS2017_improved, Bot-IoT, CICIoT2022, and N-BaIoT) and performed strongly across various classification tasks. The approach’s ability to detect traffic from unseen device types was assessed by grouping the N-BaIoT devices into five categories based on usage: Thermostats, Baby Monitors, Doorbells, Security Cameras, and Webcams. The model was trained on samples from four categories at a time and tested on samples from the remaining category. High accuracy, precision, recall, and F1-score for every held-out category highlighted the model’s robustness in traffic discrimination.
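The two mechanical steps the abstract describes, cutting a 2D feature matrix into equal-sized patches and holding out one device category at a time, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the square matrix side, the patch size, the zero-padding, and the row-major ordering are all assumptions made for illustration, since the abstract does not specify them.

```python
from typing import Dict, Iterator, List, Sequence, Tuple


def features_to_patches(features: Sequence[float], side: int, patch: int) -> List[List[float]]:
    """Reshape a flat feature vector into a side x side matrix and cut it into
    non-overlapping patch x patch sub-matrices, each flattened row-major."""
    if side % patch != 0:
        raise ValueError("patch size must divide the matrix side")
    if len(features) > side * side:
        raise ValueError("too many features for the chosen matrix side")
    # Assumption: zero-pad when the feature count is not a perfect square.
    padded = list(features) + [0.0] * (side * side - len(features))
    matrix = [padded[r * side:(r + 1) * side] for r in range(side)]
    patches = []
    for top in range(0, side, patch):
        for left in range(0, side, patch):
            patches.append(
                [matrix[top + i][left + j] for i in range(patch) for j in range(patch)]
            )
    return patches


def leave_one_category_out(
    samples_by_category: Dict[str, list],
) -> Iterator[Tuple[str, list, list]]:
    """Yield (held_out_name, train_samples, test_samples), holding out each
    category in turn and training on all the others."""
    for held_out in samples_by_category:
        train = [
            sample
            for name, group in samples_by_category.items()
            if name != held_out
            for sample in group
        ]
        test = list(samples_by_category[held_out])
        yield held_out, train, test


# A 16-feature toy flow becomes a 4x4 matrix and then four 2x2 patches.
patches = features_to_patches(list(range(16)), side=4, patch=2)
# patches[0] == [0, 1, 4, 5]
```

Each flattened patch would then be linearly embedded before entering the transformer encoder, as in a standard ViT. With the five N-BaIoT usage categories as dictionary keys, `leave_one_category_out` produces five train/test folds, one per unseen category.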


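Source journal

Expert Systems with Applications (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 13.80

Self-citation rate: 10.60%

Articles published: 2045

Review time: 8.7 months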

About the journal: Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.