A Deep Recurrent Neural Network Based Predictive Control Framework for Reliable Distributed Stream Data Processing

Jielong Xu, Jian Tang, Zhiyuan Xu, Chengxiang Yin, K. Kwiat, C. Kamhoua
{"title":"A Deep Recurrent Neural Network Based Predictive Control Framework for Reliable Distributed Stream Data Processing","authors":"Jielong Xu, Jian Tang, Zhiyuan Xu, Chengxiang Yin, K. Kwiat, C. Kamhoua","doi":"10.1109/IPDPS.2019.00036","DOIUrl":null,"url":null,"abstract":"In this paper, we present design, implementation and evaluation of a novel predictive control framework to enable reliable distributed stream data processing, which features a Deep Recurrent Neural Network (DRNN) model for performance prediction, and dynamic grouping for flexible control. Specifically, we present a novel DRNN model, which makes accurate performance prediction with careful consideration for interference of co-located worker processes, according to multilevel runtime statistics. Moreover, we design a new grouping method, dynamic grouping, which can distribute/re-distribute data tuples to downstream tasks according to any given split ratio on the fly. So it can be used to re-direct data tuples to bypass misbehaving workers. We implemented the proposed framework based on a widely used Distributed Stream Data Processing System (DSDPS), Storm. For validation and performance evaluation, we developed two representative stream data processing applications: Windowed URL Count and Continuous Queries. Extensive experimental results show: 1) The proposed DRNN model outperforms widely used baseline solutions, ARIMA and SVR, in terms of prediction accuracy; 2) dynamic grouping works as expected; and 3) the proposed framework enhances reliability by offering minor performance degradation with misbehaving workers.","PeriodicalId":403406,"journal":{"name":"2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IPDPS.2019.00036","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

Abstract

In this paper, we present design, implementation and evaluation of a novel predictive control framework to enable reliable distributed stream data processing, which features a Deep Recurrent Neural Network (DRNN) model for performance prediction, and dynamic grouping for flexible control. Specifically, we present a novel DRNN model, which makes accurate performance prediction with careful consideration for interference of co-located worker processes, according to multilevel runtime statistics. Moreover, we design a new grouping method, dynamic grouping, which can distribute/re-distribute data tuples to downstream tasks according to any given split ratio on the fly. So it can be used to re-direct data tuples to bypass misbehaving workers. We implemented the proposed framework based on a widely used Distributed Stream Data Processing System (DSDPS), Storm. For validation and performance evaluation, we developed two representative stream data processing applications: Windowed URL Count and Continuous Queries. Extensive experimental results show: 1) The proposed DRNN model outperforms widely used baseline solutions, ARIMA and SVR, in terms of prediction accuracy; 2) dynamic grouping works as expected; and 3) the proposed framework enhances reliability by offering minor performance degradation with misbehaving workers.
基于深度递归神经网络的可靠分布式流数据处理预测控制框架
在本文中,我们提出了一种新的预测控制框架的设计、实现和评估,以实现可靠的分布式流数据处理,该框架具有用于性能预测的深度递归神经网络(DRNN)模型和用于灵活控制的动态分组。具体而言,我们提出了一种新的DRNN模型,该模型根据多层运行时统计数据,在仔细考虑同址工作进程干扰的情况下,进行准确的性能预测。此外,我们还设计了一种新的分组方法——动态分组,该方法可以动态地将数据元组按照给定的分割比例分配/重新分配给下游任务。因此,它可以用来重定向数据元组,以绕过行为不端的工作器。我们基于广泛使用的分布式流数据处理系统(DSDPS) Storm实现了提出的框架。为了验证和性能评估,我们开发了两个代表性的流数据处理应用程序:窗口URL计数和连续查询。大量的实验结果表明:1)所提出的DRNN模型在预测精度方面优于广泛使用的基线解决方案ARIMA和SVR;2)动态分组工作正常;3)提出的框架通过对行为不端的工人提供轻微的性能降低来提高可靠性。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信