LGSMOTE-IDS: Line Graph based Weighted-Distance SMOTE for imbalanced network traffic detection
Authors: Guyu Zhao, Linwei Li, Hongdou He, Jiadong Ren
Journal: Expert Systems with Applications, volume 281, Article 127645 (JCR Q1, Computer Science, Artificial Intelligence; impact factor 7.5)
Published: 2025-04-17
DOI: 10.1016/j.eswa.2025.127645
URL: https://www.sciencedirect.com/science/article/pii/S0957417425012679
Citations: 0
Abstract
The application of Graph Neural Networks (GNNs) to Network Intrusion Detection Systems (NIDS) has become a prominent research focus. However, NIDS often struggles to classify minority attack samples due to the severe class imbalance in NIDS datasets, where the number of samples varies significantly across classes. Additionally, prior studies have frequently overlooked the importance of edge features in GNNs. To address these challenges, we propose LGSMOTE-IDS, a novel framework that integrates a Line Graph based Weighted-Distance SMOTE for Intrusion Detection Systems. First, we define the fine-grained protocol service graph (PSG) and transform it into its corresponding protocol service line graph (L(PSG)). This transformation provides a novel perspective for describing network traffic interactions and enables the conversion of the edge classification task into a node classification task. Second, we introduce Weighted-Distance SMOTE, an oversampling algorithm specifically tailored to NIDS datasets, which employs an improved interpolation strategy to generate synthetic minority class samples. Finally, we utilize a GNN-based classifier to predict labels for all samples. We conduct experiments on three widely used datasets: NF-UNSW-NB15, NF-BoT-IoT, and NF-ToN-IoT. LGSMOTE-IDS achieves average increases of 18.11%, 45.91%, and 36.41% in weighted F1-scores for five, one, and three minority classes across the three datasets, respectively, compared with the baseline method. Moreover, LGSMOTE-IDS successfully detects attack types that previous models fail to recognize. To the best of our knowledge, LGSMOTE-IDS is the first framework to integrate GNNs with an oversampling algorithm to address the class imbalance issue in NIDS datasets.
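The two ideas the abstract relies on, turning edges into nodes through a line graph and interpolating synthetic minority samples, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not specify how the PSG is built or how the weighted distance is computed, so the sketch treats each flow as an edge between two hypothetical "IP:port" endpoints, links two flows when they share an endpoint, and uses the classic SMOTE interpolation rather than the paper's Weighted-Distance variant. The names `line_graph` and `smote_sample` are hypothetical.

```python
import random
from itertools import combinations


def line_graph(flows):
    """Build a line graph from a list of flows.

    flows: list of (src, dst) endpoint pairs, one per flow (edge).
    Returns an adjacency dict over flow indices: two flows are linked
    when they share at least one endpoint. (Assumed construction; the
    paper's PSG definition is not given in the abstract.)
    """
    by_endpoint = {}
    for idx, (src, dst) in enumerate(flows):
        for endpoint in {src, dst}:  # set avoids double-counting self-loops
            by_endpoint.setdefault(endpoint, []).append(idx)

    adjacency = {idx: set() for idx in range(len(flows))}
    for indices in by_endpoint.values():
        for a, b in combinations(indices, 2):
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


def smote_sample(x, neighbor, rng):
    """Classic SMOTE interpolation (not the paper's weighted variant):
    pick a random point on the segment from x to one of its neighbors."""
    lam = rng.random()  # interpolation factor in [0, 1)
    return [xi + lam * (ni - xi) for xi, ni in zip(x, neighbor)]


# Three hypothetical flows; each becomes a node of the line graph.
flows = [
    ("10.0.0.1:5000", "10.0.0.2:80"),
    ("10.0.0.3:5001", "10.0.0.2:80"),
    ("10.0.0.1:5000", "10.0.0.4:53"),
]
adjacency = line_graph(flows)

# A synthetic sample between two hypothetical minority-class feature vectors.
synthetic = smote_sample([0.0, 2.0], [4.0, 6.0], random.Random(0))
```

Once flows are nodes, per-flow labels become node labels, so an ordinary node classifier can be applied and node-level oversampling such as SMOTE has feature vectors to interpolate between.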
About the journal:
Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.