Hot data identification for flash-based storage systems using multiple bloom filters

2011 IEEE 27th Symposium on Mass Storage Systems and Technologies (MSST) Pub Date : 2011-05-23 DOI:10.1109/MSST.2011.5937216

Dongchul Park, D. Du


Citations: 142

Abstract

Hot data identification can be applied to a variety of fields. In flash memory in particular, it has a critical impact on performance (through garbage collection) and on life span (through wear leveling). Although hot data identification is of paramount importance in flash memory, it has received little investigation. Moreover, all existing schemes focus almost exclusively on frequency. However, recency must be weighed equally with frequency for effective hot data identification. In this paper, we propose a novel hot data identification scheme that adopts multiple bloom filters to efficiently capture finer-grained recency as well as frequency. In addition, we propose a Window-based Direct Address Counting (WDAC) algorithm that approximates ideal hot data identification and serves as our baseline. Unlike the existing baseline algorithm, which cannot appropriately capture recency information because of its exponential batch decay, our WDAC algorithm uses a sliding window and can capture very fine-grained recency information. Our experimental evaluation with diverse realistic workloads, including real SSD traces, demonstrates that our multiple bloom filter-based scheme outperforms the state-of-the-art scheme. In particular, it consumes 50% less memory, requires up to 58% less computational overhead, and improves performance by up to 65%.
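The general idea of tracking frequency and recency with several bloom filters can be illustrated with a minimal sketch. The abstract does not specify the design, so everything below is an assumption made for illustration: the class names `BloomFilter` and `MultiBloomHotDetector`, the parameters `num_filters`, `decay_interval`, and `hot_threshold`, the SHA-256-based hashing, and the rule of clearing one filter at a fixed interval. In this sketch, the number of filters that contain an address stands in for frequency, and the periodic clearing of the oldest filter stands in for recency. It should not be read as the authors' algorithm.

```python
import hashlib
from typing import List


class BloomFilter:
    """Plain bit-array Bloom filter; hash positions are derived from SHA-256."""

    def __init__(self, num_bits: int = 2048, num_hashes: int = 2) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _positions(self, key: int) -> List[int]:
        digest = hashlib.sha256(str(key).encode()).digest()
        # Use consecutive 4-byte slices of the digest as independent hashes.
        return [
            int.from_bytes(digest[4 * i: 4 * i + 4], "big") % self.num_bits
            for i in range(self.num_hashes)
        ]

    def add(self, key: int) -> None:
        for pos in self._positions(key):
            self.bits[pos] = 1

    def __contains__(self, key: int) -> bool:
        return all(self.bits[pos] for pos in self._positions(key))

    def clear(self) -> None:
        self.bits = bytearray(self.num_bits)


class MultiBloomHotDetector:
    """Hypothetical detector: all parameters and rules here are assumptions."""

    def __init__(self, num_filters: int = 4, decay_interval: int = 512,
                 hot_threshold: int = 3) -> None:
        self.filters = [BloomFilter() for _ in range(num_filters)]
        self.decay_interval = decay_interval
        self.hot_threshold = hot_threshold
        self.writes = 0
        self.oldest = 0  # index of the filter that will be cleared next

    def record_write(self, lba: int) -> None:
        # Frequency: each repeated write marks the address in one more filter.
        for bf in self.filters:
            if lba not in bf:
                bf.add(lba)
                break
        self.writes += 1
        # Recency: every decay_interval writes, forget one filter's contents.
        if self.writes % self.decay_interval == 0:
            self.filters[self.oldest].clear()
            self.oldest = (self.oldest + 1) % len(self.filters)

    def is_hot(self, lba: int) -> bool:
        hits = sum(lba in bf for bf in self.filters)
        return hits >= self.hot_threshold
```

A caller would invoke `record_write(lba)` on every write request and `is_hot(lba)` when deciding where to place the data. Because bloom filters admit false positives, a cold address can occasionally be reported as hot; larger filters make this less likely at the cost of memory.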
