Boosting concurrency in Parallel State Machine Replication

Proceedings of the 20th International Middleware Conference Pub Date : 2019-12-09 DOI:10.1145/3361525.3361549

Ian Aragon Escobar, E. Alchieri, F. Dotti, F. Pedone


Citation count: 14

Abstract

State machine replication (SMR) is a well-known approach to implementing fault-tolerant services, providing high availability and strong consistency. To boost the performance of SMR, some proposals execute independent commands concurrently, while dependent commands execute sequentially in the total delivery order. The most general approach to handling command dependencies resorts to a directed acyclic graph (DAG), where nodes represent commands and edges represent dependencies. In this paper, we show that, due to the command arrival and multithreaded execution rates of SMR, a highly concurrent implementation of a DAG is needed. We show that a typical coarse-grained DAG implementation, where the whole graph is a critical section, results in a bottleneck in the replica. We propose two improvements to the coarse-grained DAG approach: fine-grained algorithms, using lock-coupling, and lock-free algorithms. Our fine-grained algorithms lock individual vertices in the DAG. The lock-free algorithms use nonblocking synchronization, with atomic operations, and lazy synchronization to postpone physical removal of nodes. All algorithms were integrated in a parallel SMR prototype. Experimental evaluation revealed that the fine-grained algorithms are also subject to a bottleneck. The lock-free implementation, however, achieves linear speedup with the number of worker threads, in some cases scaling up to 64 threads.
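To make the coarse-grained baseline concrete, the following is a minimal Python sketch of a dependency DAG in which every operation runs inside one global critical section. It is an illustration only, not the authors' implementation: the class and method names (`CoarseGrainedDAG`, `insert`, `get_ready`, `remove`) and the conflict rule (two commands conflict when they touch the same key and at least one is a write) are assumptions introduced here.

```python
import threading
from dataclasses import dataclass, field


@dataclass(eq=False)  # identity-based equality, so list.remove() matches the exact node
class Node:
    cmd: tuple                       # hypothetical command shape: (op, key)
    pending: int = 0                 # number of unresolved incoming dependencies
    dependents: list = field(default_factory=list)
    taken: bool = False              # True once a worker has claimed the command


def conflicts(a, b):
    # Assumed conflict rule: same key and at least one of the two is a write.
    return a[1] == b[1] and "write" in (a[0], b[0])


class CoarseGrainedDAG:
    """The whole graph is one critical section, guarded by a single lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._nodes = []             # kept in delivery order

    def insert(self, cmd):
        # Called by the scheduler in total delivery order.
        node = Node(cmd)
        with self._lock:
            for older in self._nodes:
                if conflicts(older.cmd, cmd):
                    older.dependents.append(node)   # edge: older -> node
                    node.pending += 1
            self._nodes.append(node)
        return node

    def get_ready(self):
        # Called by worker threads: claim the oldest command with no dependencies.
        with self._lock:
            for node in self._nodes:
                if node.pending == 0 and not node.taken:
                    node.taken = True
                    return node
        return None

    def remove(self, node):
        # Called by a worker after executing the command: release its dependents.
        with self._lock:
            self._nodes.remove(node)
            for dep in node.dependents:
                dep.pending -= 1
```

Because `insert`, `get_ready`, and `remove` all contend for the same lock, the scheduler and every worker thread serialize on it, which is the kind of bottleneck the abstract attributes to the coarse-grained approach. The fine-grained and lock-free variants described above replace this single lock with per-vertex locks and atomic operations, respectively; they are not shown here.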

