Workflow Transformation for Real-Time Big Data Processing
Authors: Yuji Ishizuka, Wuhui Chen, Incheon Paik
Venue: 2016 IEEE International Congress on Big Data (BigData Congress)
DOI: 10.1109/BigDataCongress.2016.47 (https://doi.org/10.1109/BigDataCongress.2016.47)
Publication date: 2016-01-15
Citations: 11
Abstract
With the explosion of big data, processing and analyzing large numbers of continuous data streams in real time, such as social media streams, sensor data streams, log streams, and stock exchange streams, has become a crucial requirement for many scientific and industrial applications in recent years. The increased volume of streaming data, together with the demand for more complex real-time analytics, requires processing pipelines to be executed as a workflow across heterogeneous event processing engines. In this paper, we propose a workflow transformation for cost minimization in real-time big data processing on heterogeneous systems. We first define the stream-based workflow, then define eight different patterns as rules for workflow transformation, and then present our workflow transformation algorithm based on these rules. Finally, our experiment shows that the proposed workflow transformation method effectively reduces both communication and computation costs.
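The idea of rule-based workflow transformation for cost minimization can be illustrated with a small sketch. The abstract does not spell out the eight patterns or the cost model, so everything below is an assumption made for illustration only: the `Operator` type, the engine labels, the flat per-hop `transfer_cost`, and the single "swap commuting neighbours" rule are all hypothetical and are not the paper's definitions. The sketch models a linear pipeline whose cost is the sum of operator computation costs plus a fixed communication cost each time data crosses from one engine to another, and it greedily applies the rule while the total cost decreases.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operator:
    """Hypothetical stream operator pinned to one processing engine."""
    name: str
    engine: str          # illustrative engine label, not from the paper
    compute_cost: float  # illustrative per-operator computation cost


def total_cost(pipeline, transfer_cost):
    """Computation cost plus a flat cost for every engine-to-engine hop."""
    compute = sum(op.compute_cost for op in pipeline)
    hops = sum(1 for a, b in zip(pipeline, pipeline[1:]) if a.engine != b.engine)
    return compute + hops * transfer_cost


def swap_rule(pipeline, i, commutes):
    """Illustrative rule: swap neighbours i and i+1 if they are known to commute."""
    a, b = pipeline[i], pipeline[i + 1]
    if frozenset((a.name, b.name)) not in commutes:
        return None
    return pipeline[:i] + [b, a] + pipeline[i + 2:]


def transform(pipeline, transfer_cost, commutes):
    """Greedily apply the rule while it strictly lowers the total cost."""
    best = list(pipeline)
    improved = True
    while improved:
        improved = False
        for i in range(len(best) - 1):
            candidate = swap_rule(best, i, commutes)
            if candidate is None:
                continue
            if total_cost(candidate, transfer_cost) < total_cost(best, transfer_cost):
                best = candidate
                improved = True
                break  # restart the scan from the new, cheaper pipeline
    return best


# Hypothetical four-stage pipeline alternating between two engines.
before = [
    Operator("parse", "engineA", 1.0),
    Operator("enrich", "engineB", 2.0),
    Operator("filter", "engineA", 1.0),
    Operator("aggregate", "engineB", 3.0),
]
# Assumption: "enrich" and "filter" may be reordered without changing results.
commutes = {frozenset(("enrich", "filter"))}

after = transform(before, transfer_cost=5.0, commutes=commutes)
print([op.name for op in after])
print(total_cost(before, 5.0), total_cost(after, 5.0))
```

In this toy setting, moving `filter` ahead of `enrich` groups operators by engine and removes two cross-engine hops. The loop terminates because every accepted swap strictly lowers the cost. A real system would need a richer workflow model (branches, joins, data rates) and a full rule set, which the abstract only names.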