CombiHeader: Minimizing the number of shim headers in redundancy elimination systems

Sumanta Saha, Andrey Lukyanenko, Antti Yla-Jaaski
{"title":"CombiHeader: Minimizing the number of shim headers in redundancy elimination systems","authors":"Sumanta Saha, Andrey Lukyanenko, Antti Yla-Jaaski","doi":"10.1109/INFCOMW.2011.5928920","DOIUrl":null,"url":null,"abstract":"Redundancy elimination has been used in many places to improve network performance. The algorithms for doing this typically split data into chunks, fingerprint them, and compare the fingerprint with cache to identify similar chunks. Then these chunks are removed from the data and headers are inserted instead of them. However, this approach presents us with two crucial shortcomings. Depending on the size of chunks, either many headers need to be inserted, or probability of missing similar regions is increased. Algorithms that try to overcome missed similarity detection by expanding chunk boundary suffers from excessive memory access due to byte-by-byte comparison. This situation leads us to propose a novel algorithm, CombiHeader, that allows near maximum similarity detection using smaller chunks sizes while using chunk aggregation technique to transmit very few headers with few memory accesses. CombiHeader uses a specialized directed graph to track and merge adjacent popular chunks. By generating different generations of CombiNodes, CombiHeader can detect different lengths of similarity region, and uses the smallest number of headers possible. Experiments show that CombiHeader uses less than 25% headers than general elimination algorithms, and this number improves with the number of hits. The required memory access to detect maximal similarity region is in the range of 1%-5% of comparable algorithms for certain situations. 
CombiHeader is implemented as a pluggable module, which can be used with any existing redundancy elimination algorithm.","PeriodicalId":402219,"journal":{"name":"2011 IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2011-04-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"10","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2011 IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/INFCOMW.2011.5928920","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 10

Abstract

Redundancy elimination is widely used to improve network performance. The algorithms typically split data into chunks, fingerprint them, and compare the fingerprints against a cache to identify similar chunks. Matching chunks are then removed from the data and shim headers are inserted in their place. However, this approach has two crucial shortcomings: depending on the chunk size, either many headers must be inserted, or the probability of missing similar regions increases. Algorithms that try to overcome missed similarity detection by expanding chunk boundaries suffer from excessive memory accesses due to byte-by-byte comparison. This situation leads us to propose a novel algorithm, CombiHeader, which detects near-maximal similarity using smaller chunk sizes while applying a chunk aggregation technique to transmit very few headers with few memory accesses. CombiHeader uses a specialized directed graph to track and merge adjacent popular chunks. By generating successive generations of CombiNodes, CombiHeader can detect similarity regions of different lengths and uses the smallest possible number of headers. Experiments show that CombiHeader uses fewer than 25% of the headers of general elimination algorithms, and this figure improves with the number of hits. The memory accesses required to detect a maximal similarity region are in the range of 1%-5% of comparable algorithms in certain situations. CombiHeader is implemented as a pluggable module that can be used with any existing redundancy elimination algorithm.
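To make the idea concrete, the following is a minimal, illustrative sketch of chunk-based redundancy elimination with adjacent-hit aggregation. It is not the paper's CombiHeader implementation (which uses a directed graph of CombiNodes); the fixed-size chunking, the `encode` function, and the tuple-based header format are all assumptions made for illustration. The key point it demonstrates is the abstract's aggregation insight: a run of consecutive cached chunks is replaced by a single combined header rather than one header per chunk.

```python
import hashlib

CHUNK_SIZE = 64  # illustrative; real systems often use content-defined chunking


def chunks(data: bytes):
    """Split data into fixed-size chunks."""
    return [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]


def fingerprint(chunk: bytes) -> str:
    """Fingerprint a chunk (truncated SHA-1 here, purely for illustration)."""
    return hashlib.sha1(chunk).hexdigest()[:8]


def encode(data: bytes, cache: set):
    """Replace cached chunks with shim headers, merging runs of
    consecutive hits into one aggregated header."""
    out = []
    run = []  # fingerprints of consecutive cached chunks awaiting aggregation
    for chunk in chunks(data):
        fp = fingerprint(chunk)
        if fp in cache:
            run.append(fp)  # extend the current run instead of emitting a header
        else:
            if run:
                out.append(("HEADER", tuple(run)))  # one header for the whole run
                run = []
            out.append(("RAW", chunk))
            cache.add(fp)
    if run:
        out.append(("HEADER", tuple(run)))
    return out
```

On a second transmission of the same data, every chunk hits the cache, so the entire payload collapses into a single aggregated header instead of one header per chunk; without aggregation, smaller chunk sizes (which catch more similarity) would pay proportionally more header overhead, which is the trade-off CombiHeader targets.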