Research on Flow Classification Model Based on Similarity and Machine Learning Algorithm

Meigen Huang, Lingling Wu, Xuewang Yuan
Published in: 2021 13th International Conference on Machine Learning and Computing
Publication date: 2021-02-26
DOI: https://doi.org/10.1145/3457682.3457687

Abstract

In recent years, the rapid development of the Internet has produced complex and diverse applications and network traffic. At the same time, network encryption technologies and new kinds of network traffic have emerged, reducing the efficiency of existing traffic classification techniques. To improve classification efficiency and reduce classification time, this paper proposes a network traffic classification model based on cosine similarity and the decision tree algorithm (cosine similarity and decision tree classification model, CSDT). First, the cosine similarity algorithm judges the similarity of adjacent network traffic; traffic with high similarity is labeled with a known classification and forwarded. For traffic with low similarity, the decision tree algorithm classifies it using the relevant feature values. The model exploits the high similarity of adjacent data streams and uses similarity algorithms to preprocess network traffic, which reduces classification time. The Moore data set, which is publicly available in the field of network traffic classification, is used for training and testing, and the results are compared with various machine learning algorithms on the Weka platform. The experimental results show that the model achieves good classification accuracy while greatly reducing classification time and improving the classification efficiency of network traffic.
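The two-stage idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract gives no threshold, feature set, or tree configuration, so the `threshold` value of 0.95, the function names, the choice to compare each flow only with the immediately preceding one, and the `toy_classifier` stand-in for a trained decision tree are all assumptions made for illustration.

```python
import math
from typing import Callable, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify_stream(
    flows: Sequence[Sequence[float]],
    fallback: Callable[[Sequence[float]], str],
    threshold: float = 0.95,  # assumed value; the abstract does not state one
) -> Tuple[List[str], int]:
    """Label each flow, reusing the previous label when adjacent flows are similar.

    Returns the labels and the number of flows that skipped the fallback classifier.
    """
    labels: List[str] = []
    reused = 0
    prev_vec = None
    prev_label = None
    for vec in flows:
        if prev_vec is not None and cosine_similarity(prev_vec, vec) >= threshold:
            # Similar to the preceding flow: copy its label, skip the classifier.
            label = prev_label
            reused += 1
        else:
            # Dissimilar (or first) flow: pay for a full classification.
            label = fallback(vec)
        labels.append(label)
        prev_vec, prev_label = vec, label
    return labels, reused


def toy_classifier(vec: Sequence[float]) -> str:
    """Hypothetical stand-in for a trained decision tree; not a real model."""
    return "A" if vec[0] >= vec[1] else "B"


if __name__ == "__main__":
    demo_flows = [[1.0, 0.0, 0.2], [0.9, 0.1, 0.2], [0.0, 1.0, 0.1]]
    print(classify_stream(demo_flows, toy_classifier))
```

In this sketch the time saving comes from the `reused` count: every flow that passes the similarity check avoids a call to the fallback classifier. A trained decision tree (for example one built in Weka, which the abstract mentions as the comparison platform) would take the place of `toy_classifier`.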