{"title":"入侵检测系统的级联机器学习方法与数据增强","authors":"Argha Chandra Dhar, Arna Roy, M. Akhand, Md Abdus Samad Kamal, Kou Yamada","doi":"10.5815/ijcnis.2024.04.02","DOIUrl":null,"url":null,"abstract":"Cybersecurity has received significant attention globally, with the ever-continuing expansion of internet usage, due to growing trends and adverse impacts of cybercrimes, which include disrupting businesses, corrupting or altering sensitive data, stealing or exposing information, and illegally accessing a computer network. As a popular way, different kinds of firewalls, antivirus systems, and Intrusion Detection Systems (IDS) have been introduced to protect a network from such attacks. Recently, Machine Learning (ML), including Deep Learning (DL) based autonomous systems, have been state-of-the-art in cyber security, along with their drastic growth and superior performance. This study aims to develop a novel IDS system that gives more attention to classifying attack cases correctly and categorizes attacks into subclass levels by proposing a two-step process with a cascaded framework. The proposed framework recognizes the attacks using one ML model and classifies them into subclass levels using the other ML model in successive operations. The most challenging part is to train both models with unbalanced cases of attacks and non-attacks in the datasets, which is overcome by proposing a data augmentation technique. Precisely, limited attack samples of the dataset are augmented in the training set to learn the attack cases properly. Finally, the proposed framework is implemented with NN, the most popular ML model, and evaluated with the NSL-KDD dataset by conducting a rigorous analysis of each subclass emphasizing the major attack class. The proficiency of the proposed cascaded approach with data augmentation is compared with the other three models: the cascaded model without data augmentation and the standard single NN model with and without the data augmentation technique. Experimental results on the NSL-KDD dataset have revealed the proposed method as an effective IDS system and outperformed existing state-of-the-art ML models.","PeriodicalId":36488,"journal":{"name":"International Journal of Computer Network and Information Security","volume":"59 47","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Cascaded Machine Learning Approach with Data Augmentation for Intrusion Detection System\",\"authors\":\"Argha Chandra Dhar, Arna Roy, M. Akhand, Md Abdus Samad Kamal, Kou Yamada\",\"doi\":\"10.5815/ijcnis.2024.04.02\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Cybersecurity has received significant attention globally, with the ever-continuing expansion of internet usage, due to growing trends and adverse impacts of cybercrimes, which include disrupting businesses, corrupting or altering sensitive data, stealing or exposing information, and illegally accessing a computer network. As a popular way, different kinds of firewalls, antivirus systems, and Intrusion Detection Systems (IDS) have been introduced to protect a network from such attacks. Recently, Machine Learning (ML), including Deep Learning (DL) based autonomous systems, have been state-of-the-art in cyber security, along with their drastic growth and superior performance. This study aims to develop a novel IDS system that gives more attention to classifying attack cases correctly and categorizes attacks into subclass levels by proposing a two-step process with a cascaded framework. The proposed framework recognizes the attacks using one ML model and classifies them into subclass levels using the other ML model in successive operations. The most challenging part is to train both models with unbalanced cases of attacks and non-attacks in the datasets, which is overcome by proposing a data augmentation technique. Precisely, limited attack samples of the dataset are augmented in the training set to learn the attack cases properly. Finally, the proposed framework is implemented with NN, the most popular ML model, and evaluated with the NSL-KDD dataset by conducting a rigorous analysis of each subclass emphasizing the major attack class. The proficiency of the proposed cascaded approach with data augmentation is compared with the other three models: the cascaded model without data augmentation and the standard single NN model with and without the data augmentation technique. Experimental results on the NSL-KDD dataset have revealed the proposed method as an effective IDS system and outperformed existing state-of-the-art ML models.\",\"PeriodicalId\":36488,\"journal\":{\"name\":\"International Journal of Computer Network and Information Security\",\"volume\":\"59 47\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-08-08\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International Journal of Computer Network and Information Security\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.5815/ijcnis.2024.04.02\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"Mathematics\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Computer Network and Information Security","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.5815/ijcnis.2024.04.02","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"Mathematics","Score":null,"Total":0}
引用次数: 0
摘要
随着互联网使用的不断扩大,网络犯罪的趋势和负面影响日益严重,包括扰乱业务、破坏或篡改敏感数据、窃取或暴露信息以及非法访问计算机网络,因此网络安全在全球范围内受到极大关注。作为一种流行的方式,不同类型的防火墙、防病毒系统和入侵检测系统(IDS)已被引入,以保护网络免受此类攻击。最近,机器学习(ML),包括基于深度学习(DL)的自主系统,在网络安全领域成为最先进的技术,其发展迅猛,性能优越。本研究旨在开发一种新型 IDS 系统,该系统更注重正确分类攻击案例,并通过级联框架提出一个两步流程,将攻击分为子类级别。提议的框架使用一个 ML 模型识别攻击,并在连续操作中使用另一个 ML 模型将攻击分类为子类级别。最具挑战性的部分是在数据集中攻击和非攻击情况不平衡的情况下训练这两个模型,通过提出数据增强技术克服了这一难题。确切地说,在训练集中增加了数据集中有限的攻击样本,以正确学习攻击案例。最后,利用最流行的 ML 模型 NN 实现了所提出的框架,并通过 NSL-KDD 数据集对每个子类进行了严格的分析评估,强调了主要的攻击类别。与其他三种模型(不带数据增强的级联模型和带或不带数据增强技术的标准单一 NN 模型)相比,建议的级联方法在数据增强方面的能力更强。在 NSL-KDD 数据集上的实验结果表明,所提出的方法是一种有效的 IDS 系统,其性能优于现有的最先进的 ML 模型。
Cascaded Machine Learning Approach with Data Augmentation for Intrusion Detection System
Cybersecurity has received significant attention globally, with the ever-continuing expansion of internet usage, due to growing trends and adverse impacts of cybercrimes, which include disrupting businesses, corrupting or altering sensitive data, stealing or exposing information, and illegally accessing a computer network. As a popular way, different kinds of firewalls, antivirus systems, and Intrusion Detection Systems (IDS) have been introduced to protect a network from such attacks. Recently, Machine Learning (ML), including Deep Learning (DL) based autonomous systems, have been state-of-the-art in cyber security, along with their drastic growth and superior performance. This study aims to develop a novel IDS system that gives more attention to classifying attack cases correctly and categorizes attacks into subclass levels by proposing a two-step process with a cascaded framework. The proposed framework recognizes the attacks using one ML model and classifies them into subclass levels using the other ML model in successive operations. The most challenging part is to train both models with unbalanced cases of attacks and non-attacks in the datasets, which is overcome by proposing a data augmentation technique. Precisely, limited attack samples of the dataset are augmented in the training set to learn the attack cases properly. Finally, the proposed framework is implemented with NN, the most popular ML model, and evaluated with the NSL-KDD dataset by conducting a rigorous analysis of each subclass emphasizing the major attack class. The proficiency of the proposed cascaded approach with data augmentation is compared with the other three models: the cascaded model without data augmentation and the standard single NN model with and without the data augmentation technique. Experimental results on the NSL-KDD dataset have revealed the proposed method as an effective IDS system and outperformed existing state-of-the-art ML models.