A Deep Recurrent Neural Network Based Predictive Control Framework for Reliable Distributed Stream Data Processing

2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS). Pub Date: 2019-05-20. DOI: 10.1109/IPDPS.2019.00036

Jielong Xu, Jian Tang, Zhiyuan Xu, Chengxiang Yin, K. Kwiat, C. Kamhoua


Citations: 1

Abstract

In this paper, we present the design, implementation, and evaluation of a novel predictive control framework for reliable distributed stream data processing. The framework features a Deep Recurrent Neural Network (DRNN) model for performance prediction and dynamic grouping for flexible control. Specifically, we present a novel DRNN model that makes accurate performance predictions from multilevel runtime statistics, with careful consideration of interference among co-located worker processes. Moreover, we design a new grouping method, dynamic grouping, which can distribute and re-distribute data tuples to downstream tasks according to any given split ratio on the fly. It can therefore be used to re-direct data tuples to bypass misbehaving workers. We implemented the proposed framework on a widely used Distributed Stream Data Processing System (DSDPS), Storm. For validation and performance evaluation, we developed two representative stream data processing applications: Windowed URL Count and Continuous Queries. Extensive experimental results show that: 1) the proposed DRNN model outperforms the widely used baseline solutions, ARIMA and SVR, in terms of prediction accuracy; 2) dynamic grouping works as expected; and 3) the proposed framework enhances reliability, incurring only minor performance degradation in the presence of misbehaving workers.
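The idea behind dynamic grouping, routing tuples to downstream tasks according to a split ratio that can change at runtime, can be illustrated with a minimal sketch. This is not the authors' implementation and does not use the Storm API. The class name `SplitRatioRouter`, its methods, and the task labels `w1` to `w3` are hypothetical, and the deterministic "largest deficit first" selection rule is one assumed way to follow a ratio.

```python
from typing import Dict, Hashable


class SplitRatioRouter:
    """Hypothetical router (not the paper's code): picks one downstream task
    per tuple so that each task's share of tuples tracks the current ratio."""

    def __init__(self, ratios: Dict[Hashable, float]) -> None:
        self.set_ratios(ratios)

    def set_ratios(self, ratios: Dict[Hashable, float]) -> None:
        """Install a new split ratio on the fly; counters restart from zero."""
        total = sum(ratios.values())
        if total <= 0 or any(r < 0 for r in ratios.values()):
            raise ValueError("ratios must be non-negative with a positive sum")
        # Normalize so the ratios sum to 1.
        self._ratios = {task: r / total for task, r in ratios.items()}
        self._sent = {task: 0 for task in ratios}
        self._count = 0

    def choose_task(self) -> Hashable:
        """Return the task whose sent count lags its target share the most."""
        self._count += 1
        task = max(
            self._ratios,
            key=lambda t: self._ratios[t] * self._count - self._sent[t],
        )
        self._sent[task] += 1
        return task


if __name__ == "__main__":
    router = SplitRatioRouter({"w1": 0.5, "w2": 0.3, "w3": 0.2})
    first = [router.choose_task() for _ in range(10)]
    print("before update:", first)

    # Assume a controller has flagged w2 as misbehaving: give it ratio 0
    # so that subsequent tuples bypass it.
    router.set_ratios({"w1": 1.0, "w2": 0.0, "w3": 1.0})
    after = [router.choose_task() for _ in range(10)]
    print("after update:", after)
```

In this sketch a task with ratio 0 is never selected after the update, which mirrors the bypass behavior described in the abstract. A real deployment would also have to handle stateful tasks and in-flight tuples, which the sketch ignores.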

