Network Transmission Flags Data Affinity-based Classification by K-Nearest Neighbor

IF 2.1 Q3 MULTIDISCIPLINARY SCIENCES

ARO-THE SCIENTIFIC JOURNAL OF KOYA UNIVERSITY Pub Date : 2022-04-25 DOI:10.14500/aro.10880

N. Aljojo


Citations: 1


Network Transmission Flags Data Affinity-based Classification by K-Nearest Neighbor

Abstract: This research examines the data generated during a network transmission session in order to understand how to extract value from that data and use it to carry out tasks. Rather than comparing all of a session's transmission flags at once, this paper conceptualizes the influence of each transmission flag on network-aware applications by comparing the flags one by one according to their impact on the application during the session. K-nearest neighbor (KNN) classification was used because it is a simple distance-based learning algorithm that stores earlier training samples, and it is suited to relating the various flags to their effect on application protocols: each new sample is compared with its K nearest points to make a decision. We used IP flow transmission session datasets obtained from Kaggle, with 87 features and 3,577,296 instances, selected 13 features from them, and ran those through KNN. RapidMiner was used for the study. The experimental results showed that the KNN-based model was not only significantly more accurate in categorizing data but also significantly more efficient, owing to decreased processing costs.
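The decision rule described in the abstract (compare each new sample with its K nearest stored samples, then vote) and the idea of assessing flags one at a time can be illustrated with a small sketch. This is not the author's implementation: the study used RapidMiner, whereas the code below is plain Python, and the flag-count feature names, protocol labels, and toy values are all hypothetical placeholders chosen only for illustration.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training samples."""
    # Euclidean distance from the query to every stored training sample.
    distances = [
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    ]
    # Keep the k closest samples and vote on their labels.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def single_feature_accuracy(train_X, train_y, test_X, test_y, feature, k=3):
    """Accuracy of KNN when only one feature column (one flag) is used."""
    train_col = [[row[feature]] for row in train_X]
    test_col = [[row[feature]] for row in test_X]
    hits = sum(
        knn_predict(train_col, train_y, q, k) == y
        for q, y in zip(test_col, test_y)
    )
    return hits / len(test_y)


# Toy, made-up flows; columns are hypothetical SYN, ACK, and PSH flag counts.
FEATURES = ["syn_count", "ack_count", "psh_count"]
train_X = [
    [1, 10, 2], [1, 12, 3], [2, 11, 2],      # labelled "DNS" (hypothetical)
    [1, 40, 20], [2, 45, 22], [1, 42, 19],   # labelled "HTTP" (hypothetical)
]
train_y = ["DNS", "DNS", "DNS", "HTTP", "HTTP", "HTTP"]
test_X = [[1, 11, 3], [2, 43, 21]]
test_y = ["DNS", "HTTP"]

# Classification using all selected features together.
predictions = [knn_predict(train_X, train_y, q, k=3) for q in test_X]

# Flag-by-flag comparison: how well does each flag classify on its own?
per_flag = {
    name: single_feature_accuracy(train_X, train_y, test_X, test_y, i, k=3)
    for i, name in enumerate(FEATURES)
}

print(predictions)
print(per_flag)
```

In this toy data the SYN count barely separates the two classes while the ACK and PSH counts do, which is the kind of per-flag contrast the one-by-one comparison is meant to surface. On a dataset with millions of instances, a brute-force distance scan like this would be slow, so a real experiment would likely rely on an indexed or library implementation.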


Source journal

ARO-THE SCIENTIFIC JOURNAL OF KOYA UNIVERSITY (MULTIDISCIPLINARY SCIENCES)

Self-citation rate

33.30%

Articles published

Review time

16 weeks