Load Thresholds for Cuckoo Hashing with Overlapping Blocks

IF 0.9 · Zone 3 (Computer Science) · Q3 COMPUTER SCIENCE, THEORY & METHODS
Stefan Walzer
DOI: https://dl.acm.org/doi/10.1145/3589558
Publication date: 2023-05-05

Abstract

We consider a natural variation of cuckoo hashing proposed by Lehman and Panigrahy (2009). Each of cn objects is assigned k = 2 intervals of size ℓ in a linear hash table of size n, and both starting points are chosen independently and uniformly at random. Each object must be placed into a table cell within its intervals, but each cell can only hold one object. Experiments suggested that this scheme outperforms the variant with blocks, in which intervals are aligned at multiples of ℓ. In particular, the load threshold, i.e., the largest load c that can be achieved with high probability, is higher. For instance, Lehman and Panigrahy (2009) empirically observed the threshold for ℓ = 2 to be around 96.5%, as compared to roughly 89.7% using blocks. They pinned down the asymptotics of the thresholds for large ℓ, but the precise values resisted rigorous analysis.
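The placement problem described above can be illustrated with a small simulation. The following is a minimal sketch, not the paper's method: it decides whether all objects fit by searching for augmenting paths in the bipartite graph between objects and cells. The function names, the wrap-around of intervals at the table end, and the requirement that ℓ divides n in the block variant are assumptions made for illustration.

```python
import random
from collections import deque


def candidate_cells(n, ell, k, aligned, rng):
    """Cells one object may occupy: the union of its k intervals of length ell."""
    cells = []
    for _ in range(k):
        if aligned:
            # Block variant: starts are multiples of ell (assumes ell divides n).
            start = rng.randrange(n // ell) * ell
        else:
            # Overlapping variant: any start; wrap-around is an assumption here.
            start = rng.randrange(n)
        cells.extend((start + j) % n for j in range(ell))
    return cells


def place(obj, choices, owner, position):
    """Place obj, relocating earlier objects along a BFS augmenting path.

    Returns False if no such path exists, in which case the objects inserted
    so far (including obj) admit no valid placement.
    """
    came_from = {}            # cell -> object that wants to move into it
    queue = deque([obj])
    while queue:
        cur = queue.popleft()
        for cell in choices[cur]:
            if cell in came_from:
                continue
            came_from[cell] = cur
            if owner[cell] is not None:
                queue.append(owner[cell])   # the occupant may have to move on
                continue
            # Free cell found: shift every object on the path by one step.
            while True:
                mover = came_from[cell]
                vacated = position.get(mover)   # None only for the new object
                owner[cell] = mover
                position[mover] = cell
                if vacated is None:
                    return True
                cell = vacated
    return False


def fits(n, c, ell=2, k=2, aligned=False, seed=0):
    """True if int(c * n) random objects can all be placed in a table of size n."""
    rng = random.Random(seed)
    owner = [None] * n        # owner[cell] = object stored there, or None
    position = {}             # position[obj] = cell currently holding obj
    choices = []
    for obj in range(int(c * n)):
        choices.append(candidate_cells(n, ell, k, aligned, rng))
        if not place(obj, choices, owner, position):
            return False
    return True
```

Averaging fits(n, c, aligned=False) and fits(n, c, aligned=True) over several seeds for loads c between roughly 0.85 and 1 gives a rough finite-n picture of the gap between the two variants. The thresholds quoted above are asymptotic statements, so a small experiment of this kind will only approximate them.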

We establish a method to determine these load thresholds for all ℓ ≥ 2, and, in fact, for general k ≥ 2. For instance, for k = ℓ = 2, we get ≈ 96.4995%. We employ a theorem due to Leconte, Lelarge, and Massoulié (2013), which adapts methods from statistical physics to the world of hypergraph orientability. In effect, the orientability thresholds for our graph families are determined by belief propagation equations for certain graph limits. As a side note, we provide experimental evidence suggesting that placements can be constructed in linear time using an adapted version of an algorithm by Khosla (2013).

Source journal
ACM Transactions on Algorithms (COMPUTER SCIENCE, THEORY & METHODS; MATHEMATICS, APPLIED)
CiteScore: 3.30
Self-citation rate: 0.00%
Articles published: 50
Review time: 6-12 weeks
About the journal: ACM Transactions on Algorithms welcomes submissions of original research of the highest quality dealing with algorithms that are inherently discrete and finite, and having mathematical content in a natural way, either in the objective or in the analysis. Most welcome are new algorithms and data structures, new and improved analyses, and complexity results. Specific areas of computation covered by the journal include combinatorial searches and objects; counting; discrete optimization and approximation; randomization and quantum computation; parallel and distributed computation; algorithms for graphs, geometry, arithmetic, number theory, strings; on-line analysis; cryptography; coding; data compression; learning algorithms; methods of algorithmic analysis; discrete algorithms for application areas such as biology, economics, game theory, communication, computer systems and architecture, hardware design, scientific computing.