Md. Monzurul Amin Ifath, Miguel Neves, Israat Haque
Fast Prototyping of Distributed Stream Processing Applications with stream2gym
arXiv - CS - Networking and Internet Architecture, published 2024-09-01
arXiv:2409.00577
Abstract
Stream processing applications have been widely adopted to meet real-time data
analytics demands, e.g., fraud detection, video analytics, and IoT applications.
Unfortunately, prototyping and testing these applications remains a cumbersome
process for developers, one that usually requires an expensive testbed and deep
multi-disciplinary expertise in areas such as networking,
distributed systems, and data engineering. As a result, it takes a long time to
deploy stream processing applications into production, and even then users face
several correctness and performance issues. In this paper, we present
stream2gym, a tool for the fast prototyping of large-scale distributed stream
processing applications. stream2gym builds on Mininet, a widely adopted network
emulation platform, and provides a high-level interface to enable developers to
easily test their applications under various operating conditions. We
demonstrate the benefits of stream2gym by prototyping and testing several
applications as well as reproducing key findings from prior research work in
video analytics and network traffic monitoring. Moreover, we show that stream2gym
produces accurate results relative to a hardware testbed while consuming
few resources (few enough to run on a single commodity laptop
even when emulating a dozen processing nodes).