IP Traffic Classification of 4G Network using Machine Learning Techniques

2021 5th International Conference on Computing Methodologies and Communication (ICCMC). Pub Date: 2021-04-08. DOI: 10.1109/ICCMC51019.2021.9418397

Rahul, Amit Gupta, A. Raj, Mayank Arora


Citations: 2

Abstract

The number of internet services and users is increasing rapidly, leading to a significant rise in internet traffic. Classifying IP traffic is therefore essential for internet service providers (ISPs), as well as for government and private organizations, to improve network management and security. IP traffic classification identifies user activity from the network traffic flowing through a system, which also helps enhance network performance. Traditional classification mechanisms based on inspecting packet payloads and port numbers are now used far less, because many internet applications use dynamic port numbers rather than well-known ones, and because encryption hinders payload inspection. Machine learning techniques are now commonly used to classify IP traffic, but little research has addressed IP traffic classification for 4G networks. In this research, we developed a new dataset by capturing packets of real-time internet traffic on a 4G network using Wireshark. We then extracted the inferred features of the captured packets using a Python script and applied five machine learning models to classify the IP traffic: Decision Tree, Support Vector Machine, K-Nearest Neighbours, Random Forest, and Naive Bayes. Random Forest gave the best accuracy, approximately 87%.
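
The abstract does not say which features the Python script inferred, so the following is only a minimal sketch of what such a step might look like. It assumes the captured packets have already been exported from Wireshark into simple tuples; the record layout, the sample values, the function name `extract_flow_features`, and the chosen statistics are all hypothetical and not taken from the paper. Packets are grouped into flows by their 5-tuple and summarized with a few per-flow statistics that do not depend on payload contents or well-known port numbers.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical packet records (not from the paper):
# (timestamp_s, src_ip, dst_ip, src_port, dst_port, protocol, length_bytes)
packets = [
    (0.00, "10.0.0.2", "93.184.216.34", 50512, 443, "TCP", 120),
    (0.05, "10.0.0.2", "93.184.216.34", 50512, 443, "TCP", 1400),
    (0.20, "10.0.0.2", "93.184.216.34", 50512, 443, "TCP", 80),
    (0.10, "10.0.0.2", "8.8.8.8", 40000, 53, "UDP", 70),
    (0.12, "10.0.0.2", "8.8.8.8", 40000, 53, "UDP", 90),
]


def extract_flow_features(packets):
    """Group packets by 5-tuple and compute simple per-flow statistics."""
    flows = defaultdict(list)
    for ts, src, dst, sport, dport, proto, length in packets:
        flows[(src, dst, sport, dport, proto)].append((ts, length))

    features = {}
    for key, pkts in flows.items():
        pkts.sort()  # order each flow's packets by timestamp
        times = [t for t, _ in pkts]
        sizes = [n for _, n in pkts]
        # Gaps between consecutive packets of the same flow
        gaps = [later - earlier for earlier, later in zip(times, times[1:])]
        features[key] = {
            "packet_count": len(pkts),
            "total_bytes": sum(sizes),
            "mean_packet_size": mean(sizes),
            "duration_s": times[-1] - times[0],
            "mean_inter_arrival_s": mean(gaps) if gaps else 0.0,
        }
    return features


if __name__ == "__main__":
    for flow, stats in extract_flow_features(packets).items():
        print(flow, stats)
```

Each resulting dictionary could serve as one row of a feature table, with a traffic-class label attached, before being passed to classifiers such as the five named in the abstract.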

