Atlas, a modular and efficient open-source BFT framework
Authors: Nuno Neto, Rolando Martins, Luís Veiga
Journal of Systems and Software, volume 222, Article 112317 (published 2024-12-09)
DOI: 10.1016/j.jss.2024.112317
URL: https://www.sciencedirect.com/science/article/pii/S0164121224003613
Impact factor: 3.7; JCR: Q1 (Computer Science, Software Engineering)
Citation count: 0
Abstract
Over the last few decades, a large body of research has been carried out on Byzantine Fault Tolerance (BFT) systems. This research has brought forward new techniques, including but not limited to techniques for ordering operations (Abraham et al., 2018; Buchman, 2016; Guo et al., 2020; Bessani et al., 2014; Duan et al., 2018) and state transfer (Bessani et al., 2013; Distler, 2021; Eischer et al., 2019), on networks that suffer from Byzantine faults. More recently, ongoing research on distributed ledgers has reignited interest in BFT, due to its high throughput when compared to other alternatives for Byzantine consensus (Vukolić, 2016).
In this paper, we present three contributions covering several aspects, including modular and extensible framework design and implementation, system optimization through the development of better networking alternatives, greater use of parallelism, several ordering protocol improvements, and an extensive comparative assessment of previous state-of-the-art approaches.
First, we introduce Atlas, an open-source modular BFT framework that aims to support the research and development of highly efficient BFT protocols by decoupling traditionally entangled sub-protocols, e.g., the consensus primitive from execution (Bessani et al., 2014), and the deferral of log management to replicated services from state transfer. Atlas also makes it possible to provide modules that can be reused across different BFT approaches, such as deterministic and probabilistic/randomized models.
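The decoupling described above can be illustrated with a minimal sketch. This is not Atlas's actual API: the names `OrderingProtocol`, `Executor`, `SortedOrdering`, `CounterService` and `Replica` are hypothetical, and the "ordering" module here is a toy stand-in for a real consensus protocol. It only shows how a replica could compose independently replaceable modules behind narrow interfaces.

```python
from abc import ABC, abstractmethod
from typing import List


class OrderingProtocol(ABC):
    """Hypothetical interface: decides the order of client requests."""

    @abstractmethod
    def order(self, requests: List[bytes]) -> List[bytes]:
        ...


class Executor(ABC):
    """Hypothetical interface: applies ordered requests to the service."""

    @abstractmethod
    def execute(self, request: bytes) -> bytes:
        ...


class SortedOrdering(OrderingProtocol):
    """Toy stand-in for a consensus module: a deterministic total order."""

    def order(self, requests: List[bytes]) -> List[bytes]:
        return sorted(requests)


class CounterService(Executor):
    """Toy replicated service: counts the requests it has applied."""

    def __init__(self) -> None:
        self.applied = 0

    def execute(self, request: bytes) -> bytes:
        self.applied += 1
        return b"ok:" + request


class Replica:
    """Composes an ordering module and an executor without coupling them."""

    def __init__(self, ordering: OrderingProtocol, executor: Executor) -> None:
        self.ordering = ordering
        self.executor = executor

    def handle(self, requests: List[bytes]) -> List[bytes]:
        # The executor only ever sees requests in the order chosen by the
        # ordering module; either side can be swapped independently.
        return [self.executor.execute(r) for r in self.ordering.order(requests)]
```

Under these assumptions, replacing `SortedOrdering` with another `OrderingProtocol` implementation would require no change to `CounterService` or `Replica`, which is the kind of reuse the paragraph describes.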
Second, we present FeBFT, a new BFT implementation built on Atlas that combines proven pre-existing ideas from PBFT, namely its three-phase consensus and view-change protocols. This base approach is then extended with novel optimizations of the protocol, namely multi-leader proposals (Stathakopoulou et al., 2019), multi-instance consensus execution (Stathakopoulou et al., 2022; Behl et al., 2015), and a configurable batching solution, which together allow us to reduce latency while improving throughput.
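As a rough illustration of the PBFT-style three-phase consensus and the batching mentioned above, the following sketch tracks one consensus instance through the pre-prepare, prepare and commit phases using the standard PBFT thresholds (n ≥ 3f + 1 replicas, 2f matching prepares, 2f + 1 commits). It is a simplified model, not FeBFT's implementation: all names (`Instance`, `make_batches`, and so on) are hypothetical, and message digests, signatures, view changes and multi-leader proposals are omitted.

```python
from dataclasses import dataclass, field
from typing import List, Set


def max_faults(n: int) -> int:
    """Largest f such that n >= 3f + 1."""
    return (n - 1) // 3


def quorum(n: int) -> int:
    """Commit quorum size, 2f + 1."""
    return 2 * max_faults(n) + 1


def make_batches(requests: List[int], max_batch: int) -> List[List[int]]:
    """Split pending requests into batches of at most max_batch items."""
    return [requests[i:i + max_batch] for i in range(0, len(requests), max_batch)]


@dataclass
class Instance:
    """State of one consensus instance on a single replica (simplified)."""

    n: int
    phase: str = "pre-prepare"
    prepares: Set[int] = field(default_factory=set)
    commits: Set[int] = field(default_factory=set)

    def _advance(self) -> None:
        f = max_faults(self.n)
        # Prepared: the pre-prepare plus 2f prepares from distinct replicas.
        if self.phase == "prepare" and len(self.prepares) >= 2 * f:
            self.phase = "commit"
        # Decided: 2f + 1 commits from distinct replicas.
        if self.phase == "commit" and len(self.commits) >= quorum(self.n):
            self.phase = "decided"

    def on_pre_prepare(self) -> None:
        if self.phase == "pre-prepare":
            self.phase = "prepare"
        self._advance()

    def on_prepare(self, sender: int) -> None:
        self.prepares.add(sender)  # a set ignores duplicate senders
        self._advance()

    def on_commit(self, sender: int) -> None:
        self.commits.add(sender)
        self._advance()
```

In this sketch, `max_batch` stands in for the configurable batching parameter: larger batches amortize the three message rounds over more requests, while smaller batches let each request start consensus sooner.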
Third, we offer a comprehensive evaluation comparing our work with other state-of-the-art BFT-SMR implementations, namely Atlas (Neto et al., 2024a) with FeBFT (Official febft repository, 2024), BFT-SMaRt (Bessani et al., 2014) and Themis (Rüsch et al., 2019).
With these contributions, we aim to lay the groundwork to: (i) improve reusability and hence productivity in BFT(-SMR) development; (ii) increase system safety, performance and scalability, and reduce recovery time, with the proposed optimizations; (iii) draw insights on the bottlenecks preventing order-of-magnitude improvements in BFT processing from a systems perspective; and lastly, (iv) improve reproducibility between different BFT (sub-)protocols by allowing for true apples-to-apples comparisons.
About the journal:
The Journal of Systems and Software publishes papers covering all aspects of software engineering and related hardware-software-systems issues. All articles should include a validation of the idea presented, e.g. through case studies, experiments, or systematic comparisons with other approaches already in practice. Topics of interest include, but are not limited to:
• Methods and tools for, and empirical studies on, software requirements, design, architecture, verification and validation, maintenance and evolution
• Agile, model-driven, service-oriented, open source and global software development
• Approaches for mobile, multiprocessing, real-time, distributed, cloud-based, dependable and virtualized systems
• Human factors and management concerns of software development
• Data management and big data issues of software systems
• Metrics and evaluation, data mining of software development resources
• Business and economic aspects of software development processes
The journal welcomes state-of-the-art surveys and reports of practical experience for all of these topics.