Probabilistic Causal Contexts for Scalable CRDTs
Pedro Henrique Fernandes, Carlos Baquero
Proceedings of the 10th Workshop on Principles and Practice of Consistency for Distributed Data, 2023-05-08
DOI: 10.1145/3578358.3591331 (https://doi.org/10.1145/3578358.3591331)
Conflict-free Replicated Data Types (CRDTs) allow a distributed system to keep operating on data even when partitions occur, and thus preserve operational availability. Most CRDTs need to track whether data evolved concurrently at different nodes and needs to be reconciled; this requires storing causality metadata whose size is proportional to the number of nodes. In this paper, we try to overcome this limitation by introducing a stochastic mechanism whose metadata is no longer linear in the number of nodes, but whose accuracy is now tied to how much divergence occurs between synchronizations. This provides a new tool that can be useful in deployments with many anonymous nodes and frequent synchronizations. However, there is an underlying trade-off with classic deterministic solutions, since the approach is now probabilistic and its accuracy depends on the configurable metadata space size.
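
To make the cost of the classic deterministic approach concrete, the following is a minimal sketch of a version vector, one common form of causality metadata that keeps one counter per node. It illustrates the baseline the abstract contrasts against, not the paper's probabilistic mechanism. The names `VersionVector`, `dominates`, `compare`, and `merge` are hypothetical and chosen only for this illustration.

```python
from typing import Dict

# Assumption for illustration: a version vector maps each node id to the
# number of updates from that node that a replica has observed.
# Its size grows with the number of nodes that ever issued an update.
VersionVector = Dict[str, int]


def dominates(a: VersionVector, b: VersionVector) -> bool:
    """Return True if `a` has observed every update that `b` has observed."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def compare(a: VersionVector, b: VersionVector) -> str:
    """Classify the causal relation between two replica states."""
    a_covers_b = dominates(a, b)
    b_covers_a = dominates(b, a)
    if a_covers_b and b_covers_a:
        return "equal"
    if a_covers_b:
        return "after"       # a causally follows b
    if b_covers_a:
        return "before"      # a causally precedes b
    return "concurrent"      # neither covers the other: reconcile


def merge(a: VersionVector, b: VersionVector) -> VersionVector:
    """Join two vectors by taking the per-node maximum."""
    return {node: max(a.get(node, 0), b.get(node, 0)) for node in a.keys() | b.keys()}
```

Two replicas that each applied an update the other has not seen, for example `{"n1": 1}` and `{"n2": 1}`, compare as concurrent and must be reconciled. Because the dictionary holds one entry per updating node, this metadata grows linearly with the node count, which is the limitation the abstract describes.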