{"title":"Real-time analysis of NetFlow data for generating network traffic statistics using Apache Spark","authors":"Milan Cermák, Tomás Jirsík, Martin Laštovička","doi":"10.1109/NOMS.2016.7502952","DOIUrl":null,"url":null,"abstract":"In this paper, we present a framework for the real-time generation of network traffic statistics on Apache Spark Streaming, a modern distributed stream processing system. Our previous results showed that stream processing systems provide enough throughput to process a large volume of NetFlow data and hence they are suitable for network traffic monitoring. This paper describes the integration of Apache Spark Streaming into a current network monitoring architecture. We prove that it is possible to implement the same basic methods for NetFlow data analysis in the stream processing framework as in the traditional ones. Moreover, our stream processing implementation discovers new information which is not available when using traditional network monitoring approaches.","PeriodicalId":344879,"journal":{"name":"NOMS 2016 - 2016 IEEE/IFIP Network Operations and Management Symposium","volume":"24 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-04-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"8","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"NOMS 2016 - 2016 IEEE/IFIP Network Operations and Management Symposium","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/NOMS.2016.7502952","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 8
Abstract
In this paper, we present a framework for the real-time generation of network traffic statistics on Apache Spark Streaming, a modern distributed stream processing system. Our previous results showed that stream processing systems provide enough throughput to process a large volume of NetFlow data and hence they are suitable for network traffic monitoring. This paper describes the integration of Apache Spark Streaming into a current network monitoring architecture. We prove that it is possible to implement the same basic methods for NetFlow data analysis in the stream processing framework as in the traditional ones. Moreover, our stream processing implementation discovers new information which is not available when using traditional network monitoring approaches.