Load Thresholds for Cuckoo Hashing with Overlapping Blocks


ACM Transactions on Algorithms, Pub Date: 2023-05-05, DOI: https://dl.acm.org/doi/10.1145/3589558

Stefan Walzer



Abstract

We consider a natural variation of cuckoo hashing proposed by Lehman and Panigrahy (2009). Each of cn objects is assigned k = 2 intervals of size ℓ in a linear hash table of size n, and both starting points are chosen independently and uniformly at random. Each object must be placed into a table cell within its intervals, but each cell can only hold one object. Experiments suggested that this scheme outperforms the variant with blocks, in which intervals are aligned at multiples of ℓ. In particular, the load threshold, i.e., the largest load c that can be achieved with high probability, is higher. For instance, Lehman and Panigrahy (2009) empirically observed the threshold for ℓ = 2 to be around 96.5%, as compared to roughly 89.7% using blocks. They pinned down the asymptotics of the thresholds for large ℓ, but the precise values resisted rigorous analysis.

We establish a method to determine these load thresholds for all ℓ ≥ 2, and, in fact, for general k ≥ 2. For instance, for k = ℓ = 2, we get ≈ 96.4995%. We employ a theorem due to Leconte, Lelarge, and Massoulié (2013), which adapts methods from statistical physics to the world of hypergraph orientability. In effect, the orientability thresholds for our graph families are determined by belief propagation equations for certain graph limits. As a side note, we provide experimental evidence suggesting that placements can be constructed in linear time using an adapted version of an algorithm by Khosla (2013).
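The placement problem described in the abstract can be made concrete with a small simulation. The following is a minimal sketch, not the paper's method: it samples k intervals of ℓ consecutive cells per object and then decides whether all objects can be placed using plain augmenting-path bipartite matching (not the adapted algorithm of Khosla mentioned above). The function names `sample_intervals` and `place_all`, the `aligned` flag for the blocks variant, and the wrap-around of intervals at the end of the table are assumptions made for illustration.

```python
import random
from collections import deque


def sample_intervals(n, m, k=2, ell=2, aligned=False, rng=random):
    """Draw, for each of m objects, k intervals of ell consecutive cells.

    Returns one list of candidate cells per object. Intervals wrap around
    the end of the table (an assumption of this sketch). With aligned=True,
    starting points are restricted to multiples of ell (the blocks variant).
    """
    choices = []
    for _ in range(m):
        cells = []
        for _ in range(k):
            if aligned:
                start = ell * rng.randrange(n // ell)
            else:
                start = rng.randrange(n)
            for offset in range(ell):
                cell = (start + offset) % n
                if cell not in cells:
                    cells.append(cell)
        choices.append(cells)
    return choices


def place_all(n, choices):
    """Try to place every object into one of its candidate cells.

    Returns position[obj] = cell for all objects, or None if no valid
    placement exists. Objects are inserted one by one; each insertion
    searches breadth-first for an augmenting path to a free cell.
    """
    owner = [None] * n                 # owner[cell] = object stored there
    position = [None] * len(choices)   # position[obj] = cell holding obj
    for root in range(len(choices)):
        parent = {}                    # parent[cell] = object that reached it
        queue = deque([root])
        placed = False
        while queue and not placed:
            cur = queue.popleft()
            for cell in choices[cur]:
                if cell in parent:
                    continue
                parent[cell] = cur
                if owner[cell] is None:
                    # Free cell found: shift objects along the path to root.
                    while cell is not None:
                        mover = parent[cell]
                        previous = position[mover]  # None for the root
                        owner[cell] = mover
                        position[mover] = cell
                        cell = previous
                    placed = True
                    break
                queue.append(owner[cell])
        if not placed:
            return None
    return position
```

To probe the threshold behaviour empirically, one could call `place_all(n, sample_intervals(n, int(c * n)))` for a large n and several loads c, once with `aligned=False` and once with `aligned=True`, and record how often a placement exists. The matching here is chosen for simplicity and makes no claim of linear running time.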


Source journal: ACM Transactions on Algorithms (COMPUTER SCIENCE, THEORY & METHODS; MATHEMATICS, APPLIED)

CiteScore: 3.30

Self-citation rate: 0.00%

Review time: 6-12 weeks

Journal description: ACM Transactions on Algorithms welcomes submissions of original research of the highest quality dealing with algorithms that are inherently discrete and finite, and having mathematical content in a natural way, either in the objective or in the analysis. Most welcome are new algorithms and data structures, new and improved analyses, and complexity results. Specific areas of computation covered by the journal include combinatorial searches and objects; counting; discrete optimization and approximation; randomization and quantum computation; parallel and distributed computation; algorithms for graphs, geometry, arithmetic, number theory, strings; on-line analysis; cryptography; coding; data compression; learning algorithms; methods of algorithmic analysis; discrete algorithms for application areas such as biology, economics, game theory, communication, computer systems and architecture, hardware design, scientific computing.