Research on stream processing engine and benchmarking framework
Qionghua Le, Mingang Chen, Wenjie Chen
2022 IEEE International Conference on Networking, Sensing and Control (ICNSC), published 2022-12-15
DOI: 10.1109/ICNSC55942.2022.10004188 (https://doi.org/10.1109/ICNSC55942.2022.10004188)
Abstract
Stream computing engines are an important part of big data systems, and benchmarking is one of the main ways to measure their performance. In this paper, we compare two engines, Spark Streaming and Flink, in terms of their stream processing technologies. We then survey the open-source benchmarking frameworks that support stream processing, along with their respective characteristics, and select the HiBench testing framework to test the two engines. The test results show that Flink outperforms Spark Streaming in shuffle, stateful computation, and windowed computation.