SnappyData: A Hybrid Transactional Analytical Store Built On Spark

Proceedings of the 2016 International Conference on Management of Data. Publication date: 2016-06-26. DOI: 10.1145/2882903.2899408

Jags Ramnarayan, Barzan Mozafari, S. Wale, Sudhir Menon, Neeraj Kumar, Hemant Bhanawat, Soubhik Chakraborty, Yogesh S. Mahajan, Rishitesh Mishra, Kishor Bachhav


Citation count: 46

Abstract

In recent years, our customers have expressed frustration with the traditional approach of combining disparate products to handle their streaming, transactional, and analytical needs. The common practice of stitching heterogeneous environments together in custom ways has caused enormous production woes by increasing development complexity and total cost of ownership. With SnappyData, an open source platform, we propose a unified engine for real-time operational analytics, delivering stream analytics, OLTP, and OLAP in a single integrated solution. We realize this platform through a seamless integration of Apache Spark (as a big data computational engine) with GemFire (as an in-memory transactional store with scale-out SQL semantics). In this demonstration, after presenting a few use-case scenarios, we exhibit SnappyData as our in-memory solution for delivering truly interactive analytics (i.e., responses within a couple of seconds) when faced with large data volumes or high-velocity streams. We show that SnappyData can exploit state-of-the-art approximate query processing techniques and a variety of data synopses. Finally, we allow the audience to define various high-level accuracy contracts (HAC) to communicate their accuracy requirements to SnappyData in an intuitive fashion.
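To make the idea of approximate query processing with an accuracy requirement concrete, the following is a minimal sketch in plain Python. It is not SnappyData's API or algorithm: it assumes a uniform random sample as the data synopsis and a normal-approximation confidence interval as the error estimate, and the names `approximate_mean` and `meets_contract` are hypothetical, introduced only for illustration.

```python
import math
import random
import statistics


def approximate_mean(values, sample_size, z=1.96, seed=0):
    """Estimate the mean of `values` from a uniform random sample.

    Returns (estimate, half_width), where half_width is the half-width of a
    normal-approximation confidence interval (z=1.96 is roughly 95%).
    Illustrative only; this is not how SnappyData builds its synopses.
    """
    rng = random.Random(seed)
    sample = rng.sample(values, sample_size)  # sample without replacement
    estimate = statistics.fmean(sample)
    # Standard error of the sample mean (no finite-population correction).
    std_err = statistics.stdev(sample) / math.sqrt(sample_size)
    return estimate, z * std_err


def meets_contract(estimate, half_width, max_relative_error):
    """Hypothetical accuracy check: is the interval half-width within the
    requested relative error of the estimate?"""
    if estimate == 0:
        return half_width == 0
    return half_width / abs(estimate) <= max_relative_error


if __name__ == "__main__":
    data_rng = random.Random(42)
    # Synthetic data standing in for a large table column.
    population = [data_rng.gauss(100.0, 15.0) for _ in range(100_000)]
    est, hw = approximate_mean(population, sample_size=2_000)
    print(f"estimate = {est:.2f} +/- {hw:.2f}")
    print("within 1% contract:", meets_contract(est, hw, 0.01))
```

In this sketch the query touches 2,000 rows instead of 100,000, and the caller states the tolerated relative error rather than choosing a sample size, which is the kind of trade-off an accuracy contract is meant to express.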


