Ian Aragon Escobar, E. Alchieri, F. Dotti, F. Pedone
{"title":"Boosting concurrency in Parallel State Machine Replication","authors":"Ian Aragon Escobar, E. Alchieri, F. Dotti, F. Pedone","doi":"10.1145/3361525.3361549","DOIUrl":null,"url":null,"abstract":"State machine replication (SMR) is a well-known approach to implementing fault-tolerant services, providing high availability and strong consistency. To boost the performance of SMR, some proposals execute independent commands concurrently, while dependent commands execute sequentially in the total delivery order. The most general approach to handling command dependencies resorts to a directed acyclic graph (DAG), where nodes represent commands and edges represent dependencies. In this paper we show that due to the command arrival and multithreaded execution rates of SMR, a highly concurrent implementation of a DAG is needed. We show that a typical coarse-grained DAG implementation, where the whole graph is a critical section, results in a bottleneck in the replica. We propose two improvements to the coarse-grained DAG approach: fine-grained algorithms, using lock-coupling, and lock-free algorithms. Our fine-grain algorithms lock individual vertices in the DAG. The lock-free algorithms use nonblocking synchronization, with atomic operations, and lazy synchronization to postpone physical removal of nodes. All algorithms were integrated in a parallel SMR prototype. Experimental evaluation revealed that the fine-grained algorithms are also subject to a bottleneck. The lock-free implementation, however, sports linear speedup with the number of working threads, in some cases scaling up to 64 threads.","PeriodicalId":381253,"journal":{"name":"Proceedings of the 20th International Middleware Conference","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"14","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 20th International Middleware Conference","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3361525.3361549","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 14
Abstract
State machine replication (SMR) is a well-known approach to implementing fault-tolerant services, providing high availability and strong consistency. To boost the performance of SMR, some proposals execute independent commands concurrently, while dependent commands execute sequentially in the total delivery order. The most general approach to handling command dependencies resorts to a directed acyclic graph (DAG), where nodes represent commands and edges represent dependencies. In this paper we show that due to the command arrival and multithreaded execution rates of SMR, a highly concurrent implementation of a DAG is needed. We show that a typical coarse-grained DAG implementation, where the whole graph is a critical section, results in a bottleneck in the replica. We propose two improvements to the coarse-grained DAG approach: fine-grained algorithms, using lock-coupling, and lock-free algorithms. Our fine-grain algorithms lock individual vertices in the DAG. The lock-free algorithms use nonblocking synchronization, with atomic operations, and lazy synchronization to postpone physical removal of nodes. All algorithms were integrated in a parallel SMR prototype. Experimental evaluation revealed that the fine-grained algorithms are also subject to a bottleneck. The lock-free implementation, however, sports linear speedup with the number of working threads, in some cases scaling up to 64 threads.