Scabbard

Encyclopedic Dictionary of Archaeology | Pub Date: 2021-10-01 | DOI: 10.1163/9789004124356_emdt_sim_000810

Georgios Theodorakis, Fotios Kounelis, P. Pietzuch, H. Pirk


Citations: 1

Abstract

Single-node multi-core stream processing engines (SPEs) can process hundreds of millions of tuples per second. Yet making them fault-tolerant with exactly-once semantics while retaining this performance is an open challenge: because the I/O bandwidth of a single node is limited, it becomes infeasible to persist all stream data and operator state during execution. Instead, single-node SPEs rely on upstream distributed systems, such as Apache Kafka, to recover stream data after failure, which necessitates complex cluster-based deployments. This lack of built-in fault-tolerance features has hindered the adoption of single-node SPEs.

We describe Scabbard, the first single-node SPE that supports exactly-once fault-tolerance semantics despite limited local I/O bandwidth. Scabbard achieves this by integrating persistence operations with the query workload. Within the operator graph, Scabbard determines where to persist streams based on the selectivity of operators: by persisting streams after operators that discard data, it can substantially reduce the required I/O bandwidth. As part of the operator graph, Scabbard supports parallel persistence operations and uses markers to decide when to discard persisted data. The persisted data volume is further reduced using workload-specific compression: Scabbard monitors stream statistics and dynamically generates computationally efficient compression operators.

Our experiments show that Scabbard can execute stream queries that process over 200 million tuples per second while recovering from failures with sub-second latencies.
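The idea of persisting a stream after operators that discard data can be illustrated with a small back-of-the-envelope model. The sketch below is not Scabbard's algorithm: the `Operator` class, the function names, and the example selectivities and tuple sizes are all assumptions made for illustration. It computes the bandwidth needed to persist the stream at each edge of a linear operator pipeline and picks the cheapest edge.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Operator:
    """Hypothetical operator description (not part of Scabbard's API)."""
    name: str
    selectivity: float    # fraction of input tuples that survive the operator
    out_tuple_bytes: int  # size of each output tuple in bytes


def edge_bandwidths(input_rate: float, input_tuple_bytes: int,
                    pipeline: List[Operator]) -> List[Tuple[str, float]]:
    """Return (edge label, bytes per second) for the input edge and
    for the output edge of every operator in a linear pipeline."""
    edges = [("input", input_rate * input_tuple_bytes)]
    rate = input_rate
    for op in pipeline:
        rate *= op.selectivity  # tuples per second leaving this operator
        edges.append((f"after {op.name}", rate * op.out_tuple_bytes))
    return edges


def cheapest_persist_point(input_rate: float, input_tuple_bytes: int,
                           pipeline: List[Operator]) -> Tuple[str, float]:
    """Pick the edge where persisting the stream needs the least bandwidth."""
    return min(edge_bandwidths(input_rate, input_tuple_bytes, pipeline),
               key=lambda edge: edge[1])


if __name__ == "__main__":
    # Assumed workload: 200 million 32-byte tuples per second, a filter that
    # keeps 10% of tuples, then a projection that halves the tuple size.
    pipeline = [Operator("filter", 0.1, 32), Operator("project", 1.0, 16)]
    for label, bandwidth in edge_bandwidths(200e6, 32, pipeline):
        print(f"{label}: {bandwidth / 1e9:.2f} GB/s")
    print("cheapest:", cheapest_persist_point(200e6, 32, pipeline)[0])
```

Under these assumed numbers, persisting the raw input would need several gigabytes per second, while persisting after the selective operators needs a small fraction of that. A real system would also have to weigh the cost of recomputing operators upstream of the persistence point during recovery, which this model ignores.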


