Reducing Fragmentation via Exploiting Backup History and Cache Knowledge Granting Security

International Journal of Advance Research and Innovative Ideas in Education. Pub Date: 2018-02-27. DOI: 10.18535/IJECS/V7I2.20

Anushka Santosh Padyal, Kshitija Shashikant Kank, Anuja Kale

Pages: 1751-1755. URL: https://doi.org/10.18535/IJECS/V7I2.20


Abstract

When duplicate chunks are eliminated across multiple backups, the chunks of a single backup become physically scattered across different containers, a problem known as fragmentation in backup systems. We observe that fragmentation arises from two categories of containers, sparse containers and out-of-order containers, which have different negative impacts and require dedicated solutions. During a restore, the majority of chunks in a sparse container are never accessed, whereas the chunks in an out-of-order container are accessed intermittently. Both degrade restore performance. Increasing the restore cache size alleviates the impact of out-of-order containers, but it is ineffective for sparse containers because they directly amplify read operations. In addition, a merging operation is required to reclaim sparse containers during garbage collection after users delete backups. To reduce fragmentation, we propose the History-Aware Rewriting algorithm (HAR) and the Cache-Aware Filter (CAF). HAR exploits historical information in backup systems to accurately identify and reduce sparse containers, and CAF exploits restore cache knowledge to identify the out-of-order containers that hurt restore performance. To reduce the metadata overhead of garbage collection, we further propose a Container-Marker Algorithm (CMA) that identifies valid containers instead of valid chunks. Although data deduplication brings many benefits, it also raises security and privacy concerns, because users' sensitive data are susceptible to both insider and outsider attacks.
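The idea of using backup history to find sparse containers can be illustrated with a minimal sketch. It is not the authors' implementation: the recipe layout `(fingerprint, container_id, chunk_size)`, the function names, the container capacity, and the 50% utilization cutoff are all assumptions introduced here for illustration. The sketch measures how much of each container a finished backup actually references, flags the poorly used containers as sparse, and then lists the duplicate chunks of the next backup that would be rewritten because they still live in those containers.

```python
from collections import defaultdict

# Assumed values for illustration only; the abstract does not specify them.
CONTAINER_SIZE = 100      # hypothetical container capacity, in bytes
SPARSE_THRESHOLD = 0.5    # hypothetical utilization cutoff for "sparse"


def container_utilization(recipe, container_size=CONTAINER_SIZE):
    """Return {container_id: fraction of the container referenced by this backup}.

    `recipe` is an iterable of (fingerprint, container_id, chunk_size) tuples,
    one per chunk reference in the backup stream (hypothetical layout).
    """
    referenced = defaultdict(int)
    seen = set()
    for fingerprint, container_id, size in recipe:
        # Count each stored chunk once, even if the backup references it repeatedly.
        if (fingerprint, container_id) in seen:
            continue
        seen.add((fingerprint, container_id))
        referenced[container_id] += size
    return {cid: used / container_size for cid, used in referenced.items()}


def sparse_containers(recipe, threshold=SPARSE_THRESHOLD,
                      container_size=CONTAINER_SIZE):
    """Containers whose utilization for this backup falls below the threshold."""
    utilization = container_utilization(recipe, container_size)
    return {cid for cid, u in utilization.items() if u < threshold}


def chunks_to_rewrite(next_recipe, sparse_ids):
    """Duplicate chunks of the next backup that sit in inherited sparse containers."""
    return [fp for fp, cid, _ in next_recipe if cid in sparse_ids]


# Previous backup: container 1 is well used, container 2 holds one small chunk.
previous_backup = [("a", 1, 40), ("b", 1, 40), ("c", 2, 10), ("a", 1, 40)]
inherited_sparse = sparse_containers(previous_backup)

# Next backup: chunk "c" is a duplicate stored in the sparse container 2.
next_backup = [("a", 1, 40), ("c", 2, 10), ("d", 3, 50)]
rewrite_list = chunks_to_rewrite(next_backup, inherited_sparse)
```

In this toy run, container 2 is the only one flagged as sparse, so only chunk `c` is selected for rewriting. The decision for the next backup relies purely on records from the previous one, which is the sense in which such a scheme is history-aware.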
