Internet Traffic Detection using Naïve Bayes and K-Nearest Neighbors (KNN) algorithm
M. Dixit, R. Sharma, Saniya Shaikh, Krutika Muley
2019 International Conference on Intelligent Computing and Control Systems (ICCS), 2019-05-01
DOI: 10.1109/ICCS45141.2019.9065655
Citations: 10
Abstract
The growth of the internet has increased both the number of users and their usage. Despite its advantages, the exponential rise in internet usage has produced excess data flow that floods the network. To maintain quality of service and speed while ensuring data security and preventing data misuse, analysis of internet data becomes essential. Analyzing the data flow involves characterizing it into different types, which can be done by inspecting packets on the basis of port numbers, payload information, or statistical features. This paper discusses the analysis of internet traffic using statistical features such as inter-packet arrival time, time to live, and number of packets. Relying on these features avoids intruding on packet information and so helps protect users' privacy. To automate the categorization of internet traffic, two supervised machine learning classification techniques, Naïve Bayes and K-Nearest Neighbors, are implemented. Experiments were performed to obtain the highest accuracy in classifying internet traffic on the basis of transaction protocol. The dataset used is UNSW-NB. The results show that classification using the K-Nearest Neighbors algorithm gives an accuracy of 85%, whereas the maximum accuracy achieved using the Naïve Bayes algorithm is 54%.
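The two classifiers named in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the Gaussian form of Naïve Bayes, the value of k, the helper names (`knn_predict`, `gnb_fit`, `gnb_predict`, `make_flows`), and the synthetic flows are all assumptions made for illustration. The data is invented and does not come from UNSW-NB, so the accuracies it prints say nothing about the 85% and 54% figures reported above.

```python
import math
import random
from collections import Counter, defaultdict

def knn_predict(train_X, train_y, x, k=3):
    """Label x by majority vote among its k nearest training rows (Euclidean)."""
    nearest = sorted(zip(train_X, train_y), key=lambda row: math.dist(row[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def gnb_fit(train_X, train_y):
    """Estimate a prior plus per-feature mean and variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(train_X, train_y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        cols = list(zip(*rows))
        means = [sum(c) / len(c) for c in cols]
        # A small floor keeps every variance strictly positive.
        variances = [sum((v - m) ** 2 for v in c) / len(c) + 1e-9
                     for c, m in zip(cols, means)]
        model[label] = (len(rows) / len(train_X), means, variances)
    return model

def gnb_predict(model, x):
    """Pick the class with the highest log posterior, treating features as independent."""
    def log_posterior(label):
        prior, means, variances = model[label]
        score = math.log(prior)
        for v, m, var in zip(x, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        return score
    return max(model, key=log_posterior)

def accuracy(predict, X, y):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(predict(x) == label for x, label in zip(X, y)) / len(y)

def make_flows(n, rng):
    """Invented flows: (inter-packet arrival time, time to live, packet count)."""
    X, y = [], []
    for _ in range(n):
        if rng.random() < 0.5:
            X.append((rng.gauss(0.02, 0.005), rng.gauss(64, 2), rng.gauss(40, 5)))
            y.append("tcp")
        else:
            X.append((rng.gauss(0.10, 0.02), rng.gauss(128, 2), rng.gauss(4, 1)))
            y.append("udp")
    return X, y

rng = random.Random(0)  # fixed seed so the run is repeatable
train_X, train_y = make_flows(200, rng)
test_X, test_y = make_flows(50, rng)

model = gnb_fit(train_X, train_y)
knn_acc = accuracy(lambda x: knn_predict(train_X, train_y, x, k=3), test_X, test_y)
nb_acc = accuracy(lambda x: gnb_predict(model, x), test_X, test_y)
print(f"KNN accuracy: {knn_acc:.2f}, Naive Bayes accuracy: {nb_acc:.2f}")
```

The features are left unscaled here, so the time-to-live column dominates the Euclidean distance in KNN. On real traffic data, normalizing the features and tuning k would likely matter.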