{"title":"AMC: Towards Trustworthy and Explorable CRDT Applications with the Automerge Model Checker","authors":"A. Jeffery, R. Mortier","doi":"10.1145/3578358.3591326","DOIUrl":"https://doi.org/10.1145/3578358.3591326","url":null,"abstract":"Conflict-free Replicated Data Types (CRDTs) enable local-first operations and asynchronous collaboration without the need for always-on centralised services. CRDTs can have a high overhead, so implementations need to be optimised, but this optimisation can lead to bugs despite the use of test suites and fuzzing. Furthermore, using CRDTs in applications is complex, observing unexpected conflict resolution, issues synchronising documents and difficulties implementing appropriate data models. Automerge is a library, exposing a JSON CRDT, that sees users having difficulties in modelling their problems, understanding their edge cases and implementing applications correctly. We introduce the Automerge Model Checker (AMC), empowering application developers to check properties about their implementations and explore them dynamically. AMC can check a range of applications as well as being able to check properties about the core of Automerge itself, helping to make more trustworthy Automerge applications. AMC is available open-source at github.com/jeffa5/automerge-model-checker.","PeriodicalId":198398,"journal":{"name":"Proceedings of the 10th Workshop on Principles and Practice of Consistency for Distributed Data","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122126343","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Probabilistic Causal Contexts for Scalable CRDTs","authors":"Pedro Henrique Fernandes, Carlos Baquero","doi":"10.1145/3578358.3591331","DOIUrl":"https://doi.org/10.1145/3578358.3591331","url":null,"abstract":"Conflict-free Replicated Data Types (CRDTs) are useful to allow a distributed system to operate on data even when partitions occur, and thus preserve operational availability. Most CRDTs need to track whether data evolved concurrently at different nodes and needs to be reconciled; this requires storing causality metadata that is proportional to the number of nodes. In this paper, we try to overcome this limitation by introducing a stochastic mechanism that is no longer linear on the number of nodes, but whose accuracy is now tied to how much divergence occurs between synchronizations. This provides a new tool that can be useful in deployments with many anonymous nodes and frequent synchronizations. However, there is an underlying trade-off with classic deterministic solutions, since the approach is now probabilistic and the accuracy depends on the configurable metadata space size.","PeriodicalId":198398,"journal":{"name":"Proceedings of the 10th Workshop on Principles and Practice of Consistency for Distributed Data","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123531414","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Verify, And Then Trust: Data Inconsistency Detection in ZooKeeper","authors":"Sushant Mane, Fang Lyu, B. Reed","doi":"10.1145/3578358.3591328","DOIUrl":"https://doi.org/10.1145/3578358.3591328","url":null,"abstract":"ZooKeeper masks crash failure of servers to provide a highly available, distributed coordination kernel; however, in production, not all failures are crash failures. Bugs in underlying software systems and hardware can corrupt the ZooKeeper replicas, leading to data loss. Since ZooKeeper is used as a 'source of truth' for mission-critical applications, it is essential to detect data inconsistencies caused by arbitrary faults to safeguard reliability. Byzantine Fault Tolerance (BFT) promises to handle these problems. However, these protocols are expensive in important dimensions: development, deployment, complexity, and performance. ZooKeeper takes an alternative approach that focuses on detecting faulty behavior rather than tolerating it and thus provides improved reliability without paying the full expense of BFT protocols. This paper describes various techniques used for detecting data inconsistencies in ZooKeeper. We also analyzed the impact of using these techniques on the reliability and performance of the overall system. Our evaluation shows that a real-time digest-based fault detection technique can be employed in production to provide improved reliability with a minimal performance penalty and no additional operational cost. We hope that our analysis and evaluation can help guide the design of next-generation primary-backup systems aiming to provide high reliability.","PeriodicalId":198398,"journal":{"name":"Proceedings of the 10th Workshop on Principles and Practice of Consistency for Distributed Data","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115798060","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Study of Semantics for CRDT-based Collaborative Spreadsheets","authors":"Elena Yanakieva, Philipp Bird, Annette Bieniusa","doi":"10.1145/3578358.3591324","DOIUrl":"https://doi.org/10.1145/3578358.3591324","url":null,"abstract":"Online collaboration is becoming prevalent in our day-to-day work. As commercial applications show, next to texts, spreadsheets are an essential tool for storing and organizing shared data. However, concurrent modifications of a collaborative spreadsheet can lead to unexpected results when they reflect implementation decisions rather than user intention. With this paper, we systematically discuss spreadsheet operations and their semantics and propose intention-preserving designs in a concurrent decentralized setting, thus supporting offline operations. We further explore different data models for shared spreadsheets based on composed Conflict-free Replicated Data Types (CRDTs) and give an implementation in the local-first framework Yjs.","PeriodicalId":198398,"journal":{"name":"Proceedings of the 10th Workshop on Principles and Practice of Consistency for Distributed Data","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121656757","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Data Management for mobile applications dependent on geo-located data","authors":"Nuno Santos, L. M. Silva, J. Leitao, Nuno M. Preguiça","doi":"10.1145/3578358.3591334","DOIUrl":"https://doi.org/10.1145/3578358.3591334","url":null,"abstract":"An increasing number of mobile applications share location-dependent information, from collaborative applications and social networks to location-based games. In such applications, users are interested in information related to their immediate surroundings or destination when moving instead of data referring to events or state in distant areas. The current database systems enforce uniform consistency models that do not take into consideration data geographical locality, requiring applications to implement ad-hoc solutions that are sub-optimal at best, and can lead to poor performance in the worst case. In this paper, we argue in favour of consistency models where data location is a key property of data items that is leveraged to govern the operation of replication protocols and the guarantees provided to data accessed by users. To illustrate this, we present FocusDB, a new data management system designed to leverage both object and client location to combine stronger and weaker levels of consistency on a per-object basis. The system discussed here represents a first step in a larger ongoing research effort focused on deriving new consistency models and replication protocols that leverage our previous observation.","PeriodicalId":198398,"journal":{"name":"Proceedings of the 10th Workshop on Principles and Practice of Consistency for Distributed Data","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116267938","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Generic Checkpointing Support for Stream-based State-Machine Replication","authors":"Laura Lawniczak, Marco Ammon, T. Distler","doi":"10.1145/3578358.3591329","DOIUrl":"https://doi.org/10.1145/3578358.3591329","url":null,"abstract":"Stream-based replication facilitates the deployment and operation of state-machine replication protocols by running them as applications on top of data-stream processing frameworks. Taking advantage of platform-provided features, this approach makes it possible to significantly minimize implementation complexity at the protocol level. To further extend the associated benefits, in this paper we examine how the concept can be used to provide generic support for creating, storing, and applying checkpoints of replica states, both in the use case for catch up and garbage collection as well as to recover failed replicas. Specifically, we present three checkpointing-mechanism designs with different degrees of platform involvement and evaluate them in the context of Twitter's stream-processing engine Heron.","PeriodicalId":198398,"journal":{"name":"Proceedings of the 10th Workshop on Principles and Practice of Consistency for Distributed Data","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133956638","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"[Short paper] Towards improved collaborative text editing CRDTs by using Natural Language Processing","authors":"Jim Bauwens, Kevin De Porre, Elisa Gonzalez Boix","doi":"10.1145/3578358.3591330","DOIUrl":"https://doi.org/10.1145/3578358.3591330","url":null,"abstract":"Collaborative text editing systems are used in a variety of cloud-based products. To ensure that documents remain consistent between users, these systems often rely on CRDTs, operational transformation, or other techniques for achieving (strong) eventual consistency. CRDT-based approaches are appealing as they incorporate strategies to ensure that concurrent updates cannot conflict. However, these strategies do not necessarily take into account program semantics and may result in unexpected behaviour from the end-user's perspective. For example, conflict resolution strategies in collaborative text editors may lead to duplicate words and incorrectly merged sentences. This position paper investigates the use of deterministic natural language processing (NLP) algorithms to improve the concurrency semantics of collaborative text editing systems that rely on CRDTs, aiming to provide a better end-user experience. We explore what is needed to ensure convergence, and highlight potential difficulties with the approach.","PeriodicalId":198398,"journal":{"name":"Proceedings of the 10th Workshop on Principles and Practice of Consistency for Distributed Data","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129963308","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Linearizable Low-latency Reads at the Edge","authors":"Joshua Guarnieri, Aleksey Charapko","doi":"10.1145/3578358.3591327","DOIUrl":"https://doi.org/10.1145/3578358.3591327","url":null,"abstract":"Edge computing enables moving data closer to users to reduce latency and improve user experience. Edge data centers are capable and reliable enough to support various data management solutions, such as caches and data stores. Unfortunately, edge storage systems sacrifice consistency to benefit from geographical proximity to users. In this paper, we present EdgePQR, a strongly consistent, edge-aware data store that allows edge clients to read \"hot\" data locally at the edge with low latency. EdgePQR relies on the piece-wise defined quorums consisting of nodes in a core cloud system and one or more edge data centers to replicate data to the edge. It then uses an edge quorum to query data locally. EdgePQR ensures safety by enforcing intersections between all vital quorums: any leader election and replication quorums intersect, and any replication and edge-read quorums intersect as well.","PeriodicalId":198398,"journal":{"name":"Proceedings of the 10th Workshop on Principles and Practice of Consistency for Distributed Data","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134512424","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Performance Trade-offs in Transactional Systems","authors":"Rafael Soares, Luís Rodrigues","doi":"10.1145/3578358.3591325","DOIUrl":"https://doi.org/10.1145/3578358.3591325","url":null,"abstract":"During the last decade a number of systems supporting different forms of distributed transactions have been proposed, each implementing a different performance trade-off in the design space. In this paper we collect the performance features that have been identified by previous works and offer a systematic analysis of known results regarding the impossibility of achieving certain combinations of desirable properties along these dimensions. We also compare previous transactional systems in the light of this set of desirable performance aspects. Finally, we discuss how certain combinations of features may be leveraged to guide new transactional system designs.","PeriodicalId":198398,"journal":{"name":"Proceedings of the 10th Workshop on Principles and Practice of Consistency for Distributed Data","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132643039","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Trees and Turtles: Modular Abstractions for State Machine Replication Protocols","authors":"Natalie Neamtu, Haobin Ni, R. Van Renesse","doi":"10.1145/3578358.3592148","DOIUrl":"https://doi.org/10.1145/3578358.3592148","url":null,"abstract":"We present two abstractions for designing modular state machine replication (SMR) protocols: trees and turtles. A tree captures the set of possible state machine histories, while a turtle represents a subprotocol that tries to find agreement in this tree. We showcase the applicability of these abstractions by constructing crash-tolerant SMR protocols out of abstract tree turtles and providing examples of tree turtle implementations. The modularity of tree turtles allows a generic approach for adding a leader for liveness. We expect that these abstractions will simplify reasoning and formal verification of SMR protocols as well as facilitate innovation in protocol designs.","PeriodicalId":198398,"journal":{"name":"Proceedings of the 10th Workshop on Principles and Practice of Consistency for Distributed Data","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132707655","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}