P2P Network Traffic Identification Based on Random Forest Algorithm

Yajun Hou
Journal: J. Networks, pp. 2456-2461. Published 2014-04-09. DOI: https://doi.org/10.4304/jnw.9.9.2456-2461
Citations: 3

Abstract

With the rapid development of computer technology over the past decades, the emergence of P2P technology has prompted the network computing model to evolve from centralized networks to distributed networks. Although P2P technology has brought tremendous changes to networking, it has also exposed many problems in practice. If P2P network traffic can be managed effectively, e.g., by identifying and controlling the traffic and distinguishing its services, this would be of great value for research on improving network service performance and utilization efficiency. However, traditional approaches adapt poorly to samples that contain heterogeneous information, to large-scale sample sets, to unnormalized data, and to data distributed unevenly in a high-dimensional feature space. Building on related research, and to overcome the limitations and shortcomings of current network traffic identification methods, we studied network traffic identification and propose an identification approach based on random forest. The experiments use the campus network of North China University of Water Resources and Electric Power, taking its outbound (egress) traffic as sample data. The results show that random forest is suitable for large-scale data, for high-dimensional data, and for data that contains a large amount of heterogeneous information. Additionally, the random forest algorithm offers broad application prospects and rich design ideas for machine learning in feature extraction, multi-class object detection, and pattern recognition.
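The abstract does not give the feature set, tree parameters, or data format used in the paper, so the following is only a minimal sketch of the general idea: a random forest (bootstrap-sampled decision trees, each node choosing among a random subset of features, combined by majority vote) applied to flow records. The three features (mean packet size, flow duration, number of distinct peers), the synthetic data generator `make_flows`, and all parameter values are assumptions made for illustration, not the author's setup.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels, feature_ids):
    """Return the (feature, threshold) that most reduces weighted Gini, or None."""
    best, best_score = None, gini(labels)
    for f in feature_ids:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build_tree(rows, labels, rng, max_depth, n_try):
    """Grow one tree; every node considers a random subset of n_try features."""
    majority = Counter(labels).most_common(1)[0][0]
    if max_depth == 0 or len(set(labels)) == 1:
        return majority  # leaf: a plain class label
    split = best_split(rows, labels, rng.sample(range(len(rows[0])), n_try))
    if split is None:
        return majority
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            build_tree([r for r, _ in left], [y for _, y in left], rng, max_depth - 1, n_try),
            build_tree([r for r, _ in right], [y for _, y in right], rng, max_depth - 1, n_try))

def predict_tree(node, row):
    """Walk from the root to a leaf and return the leaf's label."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def train_forest(rows, labels, n_trees=15, max_depth=4, n_try=2, seed=0):
    """Train each tree on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx], [labels[i] for i in idx],
                                 rng, max_depth, n_try))
    return forest

def predict_forest(forest, row):
    """Majority vote over all trees."""
    return Counter(predict_tree(tree, row) for tree in forest).most_common(1)[0][0]

def make_flows(n, rng):
    """Synthetic flows (assumed features): [mean packet bytes, duration s, distinct peers]."""
    rows, labels = [], []
    for _ in range(n):
        if rng.random() < 0.5:
            rows.append([rng.uniform(600, 1400), rng.uniform(120, 600), rng.randint(15, 60)])
            labels.append("p2p")
        else:
            rows.append([rng.uniform(100, 900), rng.uniform(1, 90), rng.randint(1, 6)])
            labels.append("other")
    return rows, labels

data_rng = random.Random(42)
train_rows, train_labels = make_flows(200, data_rng)
test_rows, test_labels = make_flows(100, data_rng)

forest = train_forest(train_rows, train_labels)
correct = sum(predict_forest(forest, r) == y for r, y in zip(test_rows, test_labels))
accuracy = correct / len(test_rows)
print(f"held-out accuracy on synthetic flows: {accuracy:.2f}")
```

The synthetic classes are deliberately easy to separate, so the accuracy printed here says nothing about real campus traffic. In practice a library implementation would normally replace the hand-written trees, with features extracted from captured flows.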