Rein:通过多get调度控制键值存储的尾部延迟

Proceedings of the Twelfth European Conference on Computer Systems Pub Date : 2017-04-23 DOI:10.1145/3064176.3064209

Waleed Reda, M. Canini, P. Suresh, Dejan Kostic, Sean Braithwaite

{"title":"Rein:通过多get调度控制键值存储的尾部延迟","authors":"Waleed Reda, M. Canini, P. Suresh, Dejan Kostic, Sean Braithwaite","doi":"10.1145/3064176.3064209","DOIUrl":null,"url":null,"abstract":"We tackle the problem of reducing tail latencies in distributed key-value stores, such as the popular Cassandra database. We focus on workloads of multiget requests, which batch together access to several data elements and parallelize read operations across the data store machines. We first analyze a production trace of a real system and quantify the skew due to multiget sizes, key popularity, and other factors. We then proceed to identify opportunities for reduction of tail latencies by recognizing the composition of aggregate requests and by carefully scheduling bottleneck operations that can otherwise create excessive queues. We design and implement a system called Rein, which reduces latency via inter-multiget scheduling using low overhead techniques. We extensively evaluate Rein via experiments in Amazon Web Services (AWS) and simulations. Our scheduling algorithms reduce the median, 95th, and 99th percentile latencies by factors of 1.5, 1.5, and 1.9, respectively.","PeriodicalId":262089,"journal":{"name":"Proceedings of the Twelfth European Conference on Computer Systems","volume":"44 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"46","resultStr":"{\"title\":\"Rein: Taming Tail Latency in Key-Value Stores via Multiget Scheduling\",\"authors\":\"Waleed Reda, M. Canini, P. Suresh, Dejan Kostic, Sean Braithwaite\",\"doi\":\"10.1145/3064176.3064209\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We tackle the problem of reducing tail latencies in distributed key-value stores, such as the popular Cassandra database. We focus on workloads of multiget requests, which batch together access to several data elements and parallelize read operations across the data store machines. We first analyze a production trace of a real system and quantify the skew due to multiget sizes, key popularity, and other factors. We then proceed to identify opportunities for reduction of tail latencies by recognizing the composition of aggregate requests and by carefully scheduling bottleneck operations that can otherwise create excessive queues. We design and implement a system called Rein, which reduces latency via inter-multiget scheduling using low overhead techniques. We extensively evaluate Rein via experiments in Amazon Web Services (AWS) and simulations. Our scheduling algorithms reduce the median, 95th, and 99th percentile latencies by factors of 1.5, 1.5, and 1.9, respectively.\",\"PeriodicalId\":262089,\"journal\":{\"name\":\"Proceedings of the Twelfth European Conference on Computer Systems\",\"volume\":\"44 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-04-23\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"46\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the Twelfth European Conference on Computer Systems\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3064176.3064209\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Twelfth European Conference on Computer Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3064176.3064209","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 46

摘要

我们解决了在分布式键值存储(如流行的Cassandra数据库)中减少尾部延迟的问题。我们关注多get请求的工作负载，这些请求将对多个数据元素的访问批处理在一起，并在数据存储机器上并行化读取操作。我们首先分析一个真实系统的生产轨迹，并量化由于多器件尺寸、密钥流行度和其他因素造成的偏差。然后，我们通过识别聚合请求的组成和仔细调度瓶颈操作(否则可能会创建过多的队列)来确定减少尾部延迟的机会。我们设计并实现了一个名为Rein的系统，它通过使用低开销技术通过多get间调度来减少延迟。我们通过亚马逊网络服务(AWS)的实验和模拟对Rein进行了广泛的评估。我们的调度算法将中位数、第95百分位和第99百分位延迟分别降低了1.5、1.5和1.9倍。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Rein: Taming Tail Latency in Key-Value Stores via Multiget Scheduling

We tackle the problem of reducing tail latencies in distributed key-value stores, such as the popular Cassandra database. We focus on workloads of multiget requests, which batch together access to several data elements and parallelize read operations across the data store machines. We first analyze a production trace of a real system and quantify the skew due to multiget sizes, key popularity, and other factors. We then proceed to identify opportunities for reduction of tail latencies by recognizing the composition of aggregate requests and by carefully scheduling bottleneck operations that can otherwise create excessive queues. We design and implement a system called Rein, which reduces latency via inter-multiget scheduling using low overhead techniques. We extensively evaluate Rein via experiments in Amazon Web Services (AWS) and simulations. Our scheduling algorithms reduce the median, 95th, and 99th percentile latencies by factors of 1.5, 1.5, and 1.9, respectively.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Proceedings of the Twelfth European Conference on Computer Systems

自引率

0.00%

发文量