Mutual clustered redundancy assisted feature selection for an intrusion detection system
T. Veeranna, K. Reddi
Journal of High Speed Networks, pp. 257-273, published 2022-08-16 (Journal Article)
DOI: https://doi.org/10.3233/jhs-220694
Citations: 0
Abstract
Intrusion detection is essential in computer networks because the widespread use of the internet leaves computers more exposed to cyber-attacks. This has motivated the development of Intrusion Detection Systems (IDS), which have attracted considerable research interest. However, a major challenge in IDS is the presence of redundant and duplicate information, which imposes a serious computational burden on network traffic classification. To address this problem, we propose a novel IDS model based on statistical processing techniques and machine learning algorithms. The machine learning algorithms are Fuzzy C-means (FCM) and Support Vector Machine (SVM), while the statistical processing techniques are correlation and joint entropy. FCM clusters the training data and SVM classifies the traffic connections. Correlation is used to discover and remove duplicate connections from every cluster, while joint entropy is used to discover and remove duplicate features from every connection. For experimental validation, three standard datasets, namely KDD Cup 99, NSL-KDD, and Kyoto2006+, are considered, and performance is measured through detection rate, precision, F-score, and accuracy. Five-fold cross-validation is performed on every dataset by varying the traffic, and the resulting average performance is compared with existing methods.
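The two statistical steps named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract gives no thresholds or formulas, so the Pearson cutoff `threshold`, the redundancy test H(X_i, X_j) = H(X_i), and the function names `drop_duplicate_connections` and `redundant_features` are all assumptions made for illustration. Features are assumed to be discrete (or already discretized), and the FCM clustering and SVM classification stages are left out.

```python
import math
from collections import Counter


def pearson(a, b):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # constant vector: correlation undefined, treated as uncorrelated
    return cov / math.sqrt(var_a * var_b)


def drop_duplicate_connections(rows, threshold=0.999):
    """Keep a connection only if it is not near-perfectly correlated
    with a connection that was already kept (assumed criterion)."""
    kept = []
    for row in rows:
        if all(pearson(row, k) < threshold for k in kept):
            kept.append(row)
    return kept


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def joint_entropy(x, y):
    """Joint entropy H(X, Y) of two discrete feature columns."""
    return entropy(list(zip(x, y)))


def redundant_features(columns, tol=1e-9):
    """Return indices of features fully determined by an earlier kept
    feature, i.e. H(X_i, X_j) == H(X_i) (assumed criterion)."""
    kept, dropped = [], []
    for j, col in enumerate(columns):
        if any(joint_entropy(columns[i], col) - entropy(columns[i]) <= tol
               for i in kept):
            dropped.append(j)
        else:
            kept.append(j)
    return dropped


# Toy data (hypothetical): the second connection is a scaled copy of the first.
connections = [[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 2, 1]]
unique_connections = drop_duplicate_connections(connections)

# Toy feature columns (hypothetical): feature 1 is the inverse of feature 0.
features = [[0, 0, 1, 1], [1, 1, 0, 0], [0, 1, 0, 1]]
redundant = redundant_features(features)
```

In a fuller pipeline, `drop_duplicate_connections` would run once per FCM cluster rather than over the whole training set, which keeps the pairwise comparison cost bounded by cluster size.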
About the journal:
The Journal of High Speed Networks is an international archival journal, active since 1992, providing a publication vehicle for covering a large number of topics of interest in the high performance networking and communication area. Its audience includes researchers, managers as well as network designers and operators. The main goal will be to provide timely dissemination of information and scientific knowledge.
The journal will publish contributed papers on novel research, survey and position papers on topics of current interest, technical notes, and short communications to report progress on long-term projects. Submissions to the Journal will be refereed consistently with the review process of leading technical journals, based on originality, significance, quality, and clarity.
The journal will publish papers on a number of topics ranging from design to practical experiences with operational high performance/speed networks.