Reconfigurable Atomic Transaction Commit

Proceedings of the 2019 ACM Symposium on Principles of Distributed Computing. Pub Date: 2019-06-04. DOI: 10.1145/3293611.3331590

Manuel Bravo, Alexey Gotsman


Citations: 4

Abstract

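Modern data stores achieve scalability by partitioning data into shards and fault tolerance by replicating each shard across several servers. A key component of such systems is a Transaction Certification Service (TCS), which atomically commits a transaction spanning multiple shards. Existing TCS protocols require 2f+1 crash-stop replicas per shard to tolerate f failures. In this paper we present atomic commit protocols that require only f+1 replicas and reconfigure the system upon failures using an external reconfiguration service. We furthermore rigorously prove that these protocols correctly implement a recently proposed TCS specification. We present protocols in two different models: the standard asynchronous message-passing model and a model with Remote Direct Memory Access (RDMA), which allows a machine to access the memory of another machine over the network without involving the latter's CPU. Our protocols are inspired by the recent FARM system for RDMA-based transaction processing. Our work codifies the core ideas of FARM as distributed TCS protocols, rigorously proves them correct, and highlights the trade-offs required by the use of RDMA.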
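To make the replica counts in the abstract concrete, here is a minimal sketch that compares the two per-shard bounds it states: 2f+1 for existing TCS protocols and f+1 for the reconfigurable protocols. The function names and the shard count used in the loop are hypothetical, chosen only for illustration; the code performs the arithmetic and does not model the protocols themselves.

```python
def replicas_per_shard(f: int, reconfigurable: bool) -> int:
    """Replicas needed per shard to tolerate f crash-stop failures.

    reconfigurable=False uses the 2f+1 bound of existing TCS protocols;
    reconfigurable=True uses the f+1 bound of the protocols in the paper.
    """
    if f < 0:
        raise ValueError("f must be non-negative")
    return f + 1 if reconfigurable else 2 * f + 1


def total_replicas(shards: int, f: int, reconfigurable: bool) -> int:
    """Total replicas across all shards (hypothetical helper)."""
    if shards < 0:
        raise ValueError("shards must be non-negative")
    return shards * replicas_per_shard(f, reconfigurable)


if __name__ == "__main__":
    shards = 10  # assumed deployment size, for illustration only
    for f in (1, 2, 3):
        existing = total_replicas(shards, f, reconfigurable=False)
        proposed = total_replicas(shards, f, reconfigurable=True)
        print(f"f={f}: 2f+1 needs {existing} replicas, f+1 needs {proposed}")
```

The difference is f replicas per shard, so the saving grows with both the number of tolerated failures and the number of shards. The f+1 bound relies on the external reconfiguration service mentioned in the abstract.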



