LGSMOTE-IDS: Line Graph based Weighted-Distance SMOTE for imbalanced network traffic detection

Impact factor: 7.5 | Zone 1, Computer Science | JCR Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Guyu Zhao, Linwei Li, Hongdou He, Jiadong Ren
Journal: Expert Systems with Applications, Volume 281, Article 127645
Publication date: 2025-04-17
Publication type: Journal Article
DOI: 10.1016/j.eswa.2025.127645
URL: https://www.sciencedirect.com/science/article/pii/S0957417425012679
Citations: 0

Abstract


LGSMOTE-IDS: Line Graph based Weighted-Distance SMOTE for imbalanced network traffic detection

The application of Graph Neural Networks (GNNs) to Network Intrusion Detection Systems (NIDS) has become a prominent research focus. However, NIDS often struggle to classify minority attack samples because NIDS datasets are severely class-imbalanced: the number of samples varies significantly across classes. Additionally, prior studies have frequently overlooked the importance of edge features in GNNs. To address these challenges, we propose LGSMOTE-IDS, a novel framework that integrates a Line Graph-based Weighted-Distance SMOTE for Intrusion Detection Systems. First, we define the fine-grained protocol service graph (PSG) and transform it into its corresponding protocol service line graph (L(PSG)). This transformation provides a novel perspective for describing network traffic interactions and converts the edge classification task into a node classification task. Second, we introduce Weighted-Distance SMOTE, an oversampling algorithm specifically tailored to NIDS datasets, which employs an improved interpolation strategy to generate synthetic minority-class samples. Finally, we use a GNN-based classifier to predict labels for all samples. We conduct experiments on three widely used datasets: NF-UNSW-NB15, NF-BoT-IoT, and NF-ToN-IoT. Compared with the baseline method, LGSMOTE-IDS achieves average increases of 18.11%, 45.91%, and 36.41% in weighted F1-scores for five, one, and three minority classes across the three datasets, respectively. Moreover, LGSMOTE-IDS successfully detects attack types that previous models fail to recognize. To the best of our knowledge, LGSMOTE-IDS is the first framework to integrate GNNs with an oversampling algorithm to address the class imbalance issue in NIDS datasets.
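
The two ideas named in the abstract, the line graph transformation and SMOTE-style interpolation, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes an undirected toy graph whose edges stand for flows, and it shows the classic SMOTE interpolation step rather than the paper's Weighted-Distance variant, whose details the abstract does not give. The names `line_graph`, `smote_sample`, and the toy `flows` are hypothetical.

```python
import random
from itertools import combinations


def line_graph(edges):
    """Build the line graph of an undirected graph.

    Each original edge (flow) becomes a node, identified by its index in
    `edges`. Two such nodes are adjacent when the original edges share an
    endpoint. Labelling edges in the original graph then corresponds to
    labelling nodes in the line graph.
    """
    adj = {i: set() for i in range(len(edges))}
    for i, j in combinations(range(len(edges)), 2):
        if set(edges[i]) & set(edges[j]):
            adj[i].add(j)
            adj[j].add(i)
    return adj


def smote_sample(x, neighbor, rng):
    """Classic SMOTE step: pick a random point on the segment x -> neighbor.

    Assumption: plain uniform interpolation, not the weighted-distance
    strategy proposed in the paper.
    """
    lam = rng.random()  # interpolation factor in [0, 1)
    return [a + lam * (b - a) for a, b in zip(x, neighbor)]


# Hypothetical toy flows between endpoints A..D.
flows = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "C")]
lg = line_graph(flows)

# Hypothetical feature vectors of two minority-class flows.
rng = random.Random(0)
synthetic = smote_sample([1.0, 10.0], [3.0, 20.0], rng)
```

Here `lg[0]` lists the flows that share an endpoint with flow 0, and `synthetic` lies on the segment between the two minority samples. A practical pipeline would select `neighbor` from the k nearest minority-class neighbors before interpolating.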
Source journal
Expert Systems with Applications (Engineering & Technology - Engineering: Electrical & Electronic)
CiteScore: 13.80
Self-citation rate: 10.60%
Articles published: 2045
Review time: 8.7 months
Journal description: Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.