Sampling-based garbage collection metadata management scheme for flash-based storage

2011 IEEE 27th Symposium on Mass Storage Systems and Technologies (MSST). Pub Date: 2011-05-23. DOI: 10.1109/MSST.2011.5937228

Biplob K. Debnath, K. Srinivasan, Weijun Xiao, D. Lilja, D. Du


Citations: 7

Abstract

Existing garbage collection algorithms for flash-based storage use score-based heuristics to select victim blocks for reclaiming free space and for wear leveling. The score of a block is estimated from metadata such as age, block utilization, and erase count. To find a victim block quickly, these algorithms maintain a priority queue in the SRAM of the storage controller. This priority queue takes O(K) space, where K is the flash storage capacity expressed as the total number of blocks. As flash capacity grows, K grows with it. However, because of its higher price per byte, SRAM will not scale proportionately, so it will be challenging to fit a larger priority queue into the limited SRAM of a large-capacity flash storage device. In addition to the space issue, every update to the metadata requires updating the priority queue, which takes O(lg(K)) operations. This computation overhead also increases with flash capacity. In this paper, we take a novel approach to the garbage collection metadata management problem of large-capacity flash storage. We propose a sampling-based approach that approximates existing garbage collection algorithms within the limited SRAM space. Since these algorithms are heuristic-based, our sampling-based algorithm will perform as well as the unsampled (original) algorithm if we choose good samples for making garbage collection decisions. We propose a very simple policy for choosing samples. Our experimental results show that a small number of samples is sufficient to emulate existing garbage collection algorithms.
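
The abstract does not spell out the score function or the sampling policy, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes uniform random sampling and a hypothetical score that combines age, utilization, and erase count; the names `Block`, `score`, `pick_victim_full`, `pick_victim_sampled`, and `make_blocks` are invented for illustration. A full scan over all K blocks stands in for the priority queue, and the sampled variant scores only a fixed number of blocks per decision.

```python
import random
from dataclasses import dataclass


@dataclass
class Block:
    block_id: int
    utilization: float  # fraction of pages still valid, in [0, 1]
    age: float          # time since last update, arbitrary units
    erase_count: int    # number of erases so far


def score(block: Block) -> float:
    """Hypothetical score (assumption): higher means a better victim.

    Favors old, mostly invalid blocks and penalizes heavily erased ones.
    """
    return block.age * (1.0 - block.utilization) / (1.0 + block.erase_count)


def pick_victim_full(blocks):
    """Baseline: consider all K blocks, as the unsampled algorithm would."""
    return max(blocks, key=score)


def pick_victim_sampled(blocks, n_samples, rng):
    """Score only n_samples uniformly chosen blocks and return the best one."""
    candidates = rng.sample(blocks, min(n_samples, len(blocks)))
    return max(candidates, key=score)


def make_blocks(k, rng):
    """Generate k blocks with synthetic metadata (not real trace data)."""
    return [
        Block(i, rng.random(), rng.uniform(0.0, 100.0), rng.randrange(0, 1000))
        for i in range(k)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    blocks = make_blocks(10_000, rng)
    best = pick_victim_full(blocks)
    sampled = pick_victim_sampled(blocks, 32, rng)
    print("full scan score:   ", score(best))
    print("sampled (32) score:", score(sampled))
```

In this sketch the per-decision working set is bounded by the sample size rather than by K, which is the property the abstract is after. How close the sampled victim comes to the full-scan victim depends on the sample size and the score distribution, and the synthetic metadata here says nothing about the paper's measured results.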
