Adding Velocity to BigBench

Todor Ivanov, Patrick Bedué, A. Ghazal, R. Zicari
{"title":"Adding Velocity to BigBench","authors":"Todor Ivanov, Patrick Bedué, A. Ghazal, R. Zicari","doi":"10.1145/3209950.3209956","DOIUrl":null,"url":null,"abstract":"BigBench standardized as TPCx-BB is a popular application benchmark that targets Big Data storage and processing systems. BigBench V2 addresses some of the BigBench limitations by introducing a new simplified data model, semi-structured web logs in JSON file format and new queries mandating late binding. However, it still covers only batch processing workloads and the Big Data velocity characteristic is not addressed. This work extends the BigBench V2 benchmark with a data streaming component that simulates typical statistical and predictive analytics queries in a retail business scenario. Our approach is to preserve the existing BigBench design and introduce a new streaming component that supports two data streaming modes: active and passive. In active mode, the data stream generation and processing happen in parallel, whereas in passive mode, the data stream is pre-generated in advance before the actual stream processing. The stream workload consists of five queries inspired by the existing 30 BigBench queries. To validate the proposed streaming extension, the two streaming modes were implemented and tested using Kafka and Spark Streaming. The experimental results prove the feasibility of our benchmark design. Finally, we outline design challenges and future plans for improving the proposed BigBench extension.","PeriodicalId":436501,"journal":{"name":"Proceedings of the Workshop on Testing Database Systems","volume":"212 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Workshop on Testing Database Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3209950.3209956","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3

Abstract

BigBench standardized as TPCx-BB is a popular application benchmark that targets Big Data storage and processing systems. BigBench V2 addresses some of the BigBench limitations by introducing a new simplified data model, semi-structured web logs in JSON file format and new queries mandating late binding. However, it still covers only batch processing workloads and the Big Data velocity characteristic is not addressed. This work extends the BigBench V2 benchmark with a data streaming component that simulates typical statistical and predictive analytics queries in a retail business scenario. Our approach is to preserve the existing BigBench design and introduce a new streaming component that supports two data streaming modes: active and passive. In active mode, the data stream generation and processing happen in parallel, whereas in passive mode, the data stream is pre-generated in advance before the actual stream processing. The stream workload consists of five queries inspired by the existing 30 BigBench queries. To validate the proposed streaming extension, the two streaming modes were implemented and tested using Kafka and Spark Streaming. The experimental results prove the feasibility of our benchmark design. Finally, we outline design challenges and future plans for improving the proposed BigBench extension.
增加BigBench的速度
BigBench被标准化为TPCx-BB,是一种针对大数据存储和处理系统的流行应用基准。BigBench V2通过引入新的简化数据模型、JSON文件格式的半结构化web日志和新的查询强制后期绑定,解决了BigBench的一些限制。然而,它仍然只涵盖批处理工作负载,并且没有解决大数据速度特性。这项工作扩展了BigBench V2基准测试,使用数据流组件模拟零售业务场景中的典型统计和预测分析查询。我们的方法是保留现有的BigBench设计,并引入一个新的流组件,支持两种数据流模式:主动和被动。在主动模式下,数据流的生成和处理是并行进行的,而在被动模式下,数据流是在实际处理之前预先生成的。流工作负载由现有的30个BigBench查询启发的5个查询组成。为了验证提议的流扩展,使用Kafka和Spark streaming实现和测试了两种流模式。实验结果证明了基准设计的可行性。最后,我们概述了改进BigBench扩展的设计挑战和未来计划。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信