Improve Restore Speed in Deduplication Systems Using Segregated Cache

Wenjie Liu, Ping-Hsiu Huang, Tao Lu, Xubin He, Hua Wang, Ke Zhou
Published in: 2016 IEEE 24th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS), September 2016
DOI: 10.1109/MASCOTS.2016.46
Citations: 4

Abstract

The chunk fragmentation problem inherent to deduplication systems significantly degrades restore performance, because storage indirection forces the restore process to assemble chunks scattered across a large number of containers. Existing solutions to the fragmentation problem either sacrifice deduplication efficiency or require additional memory resources. In this work, we propose a new restore cache scheme that accelerates the restore process using the same amount of cache space as a traditional LRU restore cache. We leverage recipe knowledge to recognize the containers that will soon be accessed while restoring a backup version, and classify them as either bursty containers or regular containers. The two classes are then placed in separate caches. Bursty containers, which hold many chunks needed for restore within a short period of time, go in a smaller cache managed at container granularity. In contrast, regular containers go in a larger cache managed at chunk granularity, with chunks that will not be used dropped at the time the containers are brought in. As a result, bursty containers are more likely to be evicted from the restore cache quickly, rather than occupying cache space unnecessarily long. Our evaluation results demonstrate that the proposed cache scheme improves the restore speed factor by up to 3.05X and reduces the number of container reads by 67.3% on average, relative to a conventional LRU restore cache.
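As a rough illustration of the scheme the abstract describes, here is a minimal Python sketch of a segregated restore cache. It classifies containers by their reference count over the whole recipe, a simplification of the paper's "soon to be accessed" look-ahead; all names and parameters (`restore`, `bursty_threshold`, `bursty_cap`, `regular_cap`) and the eviction details are our own assumptions, not the paper's implementation.

```python
from collections import Counter, OrderedDict

def restore(recipe, read_container, bursty_threshold=8,
            bursty_cap=4, regular_cap=1024):
    """Sketch of a segregated restore cache (hypothetical simplification).

    recipe         -- ordered list of (chunk_id, container_id) pairs
    read_container -- container_id -> {chunk_id: data}; each call models
                      one container read from disk
    Containers the recipe references at least `bursty_threshold` times are
    classed as bursty and cached whole (container granularity, small LRU);
    all others are regular, and only their still-needed chunks are kept
    (chunk granularity, larger LRU).
    """
    refs = Counter(cid for _, cid in recipe)    # recipe knowledge
    needed = Counter(chk for chk, _ in recipe)  # remaining uses per chunk
    bursty = OrderedDict()   # container_id -> {chunk_id: data}, LRU order
    regular = OrderedDict()  # chunk_id -> data, LRU order
    out, reads = [], 0

    for chunk_id, container_id in recipe:
        needed[chunk_id] -= 1                   # this reference is served now
        if container_id in bursty:
            data = bursty[container_id][chunk_id]
            bursty.move_to_end(container_id)
        elif chunk_id in regular:
            data = regular[chunk_id]
            if needed[chunk_id] == 0:
                del regular[chunk_id]           # dead chunk: free the slot
            else:
                regular.move_to_end(chunk_id)
        else:
            container = read_container(container_id)
            reads += 1
            data = container[chunk_id]
            if refs[container_id] >= bursty_threshold:
                bursty[container_id] = container   # keep whole container
                if len(bursty) > bursty_cap:
                    bursty.popitem(last=False)     # evict LRU container
            else:
                # Drop chunks the recipe will never ask for again.
                for c, d in container.items():
                    if needed[c] > 0:
                        regular[c] = d
                        regular.move_to_end(c)
                while len(regular) > regular_cap:
                    regular.popitem(last=False)
        out.append(data)
    return out, reads
```

The key design point the abstract argues for is visible here: a bursty container is evicted as one unit once its burst of references has passed, while the regular cache never wastes slots on chunks the recipe has no further use for.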