Extended consistent hashing: an efficient framework for object location

24th International Conference on Distributed Computing Systems, 2004. Proceedings. Pub Date: 2004-03-24. DOI: 10.1109/ICDCS.2004.1281590

Shan Lei, A. Grama

Citations: 10

Abstract

Content caching and location are key enabling technologies for achieving the high throughput needed to sustain current Internet infrastructure, for both peer-to-peer and client-server applications. An important aspect of distributed caching techniques is the mapping of data and requests to caches so as to maximize system throughput while minimizing costs in the presence of network and cache failures. We describe a new cache protocol based on consistent hashing (CH) [D. Karger et al. (1997), (1999)]. Compared to consistent hashing, our protocol, called extended consistent hashing (ECH), handles flash access to objects significantly better and yields better worst-case response times and lower load variance. Because clients in a distributed hashing scheme hold multiple, differing views of the caches, a single object (or its reference) may be cached at multiple locations. The number of such locations is referred to as the spread of the object. Consistent hashing maps a request to a cache irrespective of the spread of the requested object. ECH, on the other hand, estimates the spread of an object and randomizes requests over the expected spread. In doing so, it amortizes requests over a larger number of caches. While the expected load on target caches in ECH remains the same as in consistent hashing (asymptotically optimal), load variance is significantly reduced. We present analytical results as well as simulations that demonstrate significant improvements for querying frequently accessed objects: up to 80% in worst-case response time and 30% in variance of server/target cache loads. We also show excellent correlation between expected and observed results. What makes ECH particularly attractive is that it can be integrated into existing infrastructure based on consistent hashing with minimal software overhead.
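The contrast between the two lookup rules can be illustrated with a minimal Python sketch. It is not the authors' protocol: the abstract does not say how ECH estimates spread or which caches make up the spread set, so the sketch takes the estimated spread as a plain parameter and assumes, purely for illustration, that the candidate caches are the next `spread` caches clockwise on the ring. The names `Ring`, `successors`, `lookup_ch`, and `lookup_spread` are hypothetical.

```python
import bisect
import hashlib
import random
from collections import Counter


def _hash(key: str) -> int:
    """Map a string to a point on the ring [0, 2**32)."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:4], "big")


class Ring:
    """Plain consistent-hashing ring with one point per cache (no virtual nodes)."""

    def __init__(self, caches):
        self._points = sorted((_hash(c), c) for c in caches)
        self._keys = [point for point, _ in self._points]

    def successors(self, obj: str, k: int):
        """Return up to k caches found clockwise from the object's hash point."""
        n = len(self._points)
        k = min(k, n)
        start = bisect.bisect_right(self._keys, _hash(obj))
        return [self._points[(start + i) % n][1] for i in range(k)]

    def lookup_ch(self, obj: str) -> str:
        """Consistent hashing: always the first cache clockwise, whatever the spread."""
        return self.successors(obj, 1)[0]

    def lookup_spread(self, obj: str, spread: int, rng=random) -> str:
        """Spread-aware lookup (assumption): pick uniformly among `spread` candidates.

        `spread` stands in for an estimate of the object's spread; how that
        estimate is obtained is not specified in the abstract.
        """
        return rng.choice(self.successors(obj, spread))


if __name__ == "__main__":
    ring = Ring([f"cache-{i}" for i in range(10)])
    rng = random.Random(0)
    # A burst of requests for one hot object under each rule.
    ch_load = Counter(ring.lookup_ch("hot-object") for _ in range(1000))
    spread_load = Counter(ring.lookup_spread("hot-object", 4, rng) for _ in range(1000))
    print("CH load per cache:    ", dict(ch_load))
    print("Spread load per cache:", dict(spread_load))
```

Under plain consistent hashing every request for the hot object lands on one cache, whereas the spread-aware rule divides the same burst among several caches. This is the intuition behind the lower load variance claimed above; the sketch does not reproduce the paper's analysis or its reported figures.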
