Investigation of network intrusion detection using data visualization methods

2018 59th International Scientific Conference on Information Technology and Management Science of Riga Technical University (ITMS). Pub Date: 2018-10-01. DOI: 10.1109/ITMS.2018.8552977

V. Bulavas


Citations: 2

Abstract

Network intrusion detection data come from numerous sources, for example network traffic, system host logs, user activity such as mail or browsing, and the use of smart devices. All of these data arrive in large volume, at high velocity and in great variety. Analysis of such data is essential for making anomaly detection and intrusion prevention decisions. Common data processing steps, following data acquisition and pre-processing, are data reduction and projection. These steps reduce the number of dimensions and support visualization, which enables observation of distinct features in real time. Projection and visualization, discussed further in this paper, are required for a better understanding of the intrusion phenomena contained in the data, such as data theft, malware activity or hacking attempts. Machine learning reduces data complexity, supports the discovery of anomalies and speeds up related decision-making. Visualization helps to further understand the data by revealing well-hidden data properties and features. Numerous methods of multidimensional data visualization are currently available to assist a data scientist or information security analyst in the broad landscape of intrusion data analysis. For simplicity, visualization methods in this paper are categorized as direct, linear projection, non-linear projection and other. Attention is drawn to linear projection, in particular principal component analysis (PCA), which helps to select the most informative dimensions of the data. PCA provides an indication of anomalies in network traffic. The Decision Tree method is used to provide decision criteria for recognizing an anomaly as an intrusion. The investigation demonstrates that a combination of PCA and Decision Tree methods allows classification of intrusions such as Smurf, Satan, Neptune, Portsweep and Ipsweep with probabilities higher than 95%, with the tree depth set to 4 and the number of PCA components set to 10. Nevertheless, Nmap and Teardrop intrusions are classified poorly, so a deeper Decision Tree is needed to increase classification accuracy.
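
The combination described in the abstract (projection onto 10 principal components followed by a Decision Tree of depth 4) can be illustrated with a minimal sketch. This is not the author's code: the use of scikit-learn, the feature scaling step, the synthetic data produced by `make_classification`, and the helper names `build_model`, `make_synthetic_traffic` and `run_experiment` are all assumptions made for illustration. The abstract does not name the dataset or the tooling, so the numbers this sketch produces say nothing about the reported results.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Settings quoted in the abstract.
N_COMPONENTS = 10  # number of PCA components
TREE_DEPTH = 4     # maximum depth of the Decision Tree


def build_model(n_components=N_COMPONENTS, max_depth=TREE_DEPTH):
    """Hypothetical helper: scale features, project with PCA, classify with a tree."""
    return make_pipeline(
        StandardScaler(),  # assumption: PCA is sensitive to feature scale
        PCA(n_components=n_components),
        DecisionTreeClassifier(max_depth=max_depth, random_state=0),
    )


def make_synthetic_traffic(n_samples=2000, n_features=40, n_classes=4, seed=0):
    """Hypothetical stand-in for labelled traffic records (synthetic, not real data)."""
    return make_classification(
        n_samples=n_samples,
        n_features=n_features,
        n_informative=12,
        n_redundant=8,
        n_classes=n_classes,
        n_clusters_per_class=1,
        random_state=seed,
    )


def run_experiment():
    """Fit the pipeline on a train split and score it on a held-out split."""
    X, y = make_synthetic_traffic()
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=0
    )
    model = build_model()
    model.fit(X_train, y_train)
    accuracy = accuracy_score(y_test, model.predict(X_test))
    # Share of total variance retained by the kept principal components.
    explained = model.named_steps["pca"].explained_variance_ratio_.sum()
    return model, accuracy, explained


if __name__ == "__main__":
    model, accuracy, explained = run_experiment()
    print(f"held-out accuracy: {accuracy:.3f}")
    print(f"variance retained by {N_COMPONENTS} components: {explained:.3f}")
```

Per-class probabilities of the kind the abstract refers to could be read from `model.predict_proba(X_test)`. Raising `max_depth` in `build_model` is one way to explore the abstract's remark that a deeper tree is needed for the poorly classified classes.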

