Investigating hybrid SSD FTL schemes for Hadoop workloads

Hyeran Jeon, Kaoutar El Maghraoui, G. Kandiraju
{"title":"Investigating hybrid SSD FTL schemes for Hadoop workloads","authors":"Hyeran Jeon, Kaoutar El Maghraoui, G. Kandiraju","doi":"10.1145/2482767.2482793","DOIUrl":null,"url":null,"abstract":"The Flash Translation Layer (FTL) is the core engine for Solid State Disks (SSD). It is responsible for managing the virtual to physical address mappings and emulating the functionality of a normal block-level device. SSD performance is highly dependent on the design of the FTL. For the last few years, several FTL schemes have been proposed. Hybrid FTL schemes have gained more popularity since they try to combine the benefits of both page-level mapping and block-level mapping schemes. Examples include BAST, FAST, LAST, etc. To provide high performance, FTL designers face several cross cutting issues: the right balance between coarse and fine grain address mapping, the asymmetric nature of reads and writes, the write amplification property of Flash memory, and the wear-out behavior of flash.\n The MapReduce paradigm has become a very popular paradigm for performing parallel and distributed computations on large data. Hadoop, an open-source implementation of MapReduce, has accelerated MapReduce adoption. Flash SSD is increasingly being used as a storage solution in Hadoop deployments for faster processing and better energy utilization. Little work has been done to understand the endurance implications of SSD on Hadoop-based workloads. In this paper, using a highly flexible and reconfigurable kernel-level simulation infrastructure, we investigate the internal characteristics of various hybrid FTL schemes using a representative set of Hadoop workloads. 
Our investigation brings out the wear-out behavior of SSD for Hadoop-based workloads including wear-leveling details, garbage collection, translation and block/page mappings, and advocates the need for dynamic tuning of FTL parameters for these workloads.","PeriodicalId":430420,"journal":{"name":"ACM International Conference on Computing Frontiers","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2013-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"14","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM International Conference on Computing Frontiers","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2482767.2482793","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 14

Abstract

The Flash Translation Layer (FTL) is the core engine of a Solid State Drive (SSD). It is responsible for managing virtual-to-physical address mappings and emulating the functionality of a normal block-level device. SSD performance is highly dependent on the design of the FTL. Over the last few years, several FTL schemes have been proposed. Hybrid FTL schemes have gained popularity because they try to combine the benefits of both page-level and block-level mapping schemes; examples include BAST, FAST, and LAST. To provide high performance, FTL designers face several cross-cutting issues: the right balance between coarse- and fine-grained address mapping, the asymmetric nature of reads and writes, the write amplification property of flash memory, and the wear-out behavior of flash.

The MapReduce paradigm has become a very popular paradigm for performing parallel and distributed computations on large data. Hadoop, an open-source implementation of MapReduce, has accelerated MapReduce adoption. Flash SSDs are increasingly being used as a storage solution in Hadoop deployments for faster processing and better energy utilization, yet little work has been done to understand the endurance implications of SSDs for Hadoop-based workloads. In this paper, using a highly flexible and reconfigurable kernel-level simulation infrastructure, we investigate the internal characteristics of various hybrid FTL schemes under a representative set of Hadoop workloads. Our investigation brings out the wear-out behavior of SSDs for Hadoop-based workloads, including wear-leveling details, garbage collection, translation, and block/page mappings, and advocates the need for dynamic tuning of FTL parameters for these workloads.
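To make the hybrid-mapping idea concrete, the following is a minimal toy sketch of a BAST-style hybrid FTL in Python. It is illustrative only and is not the paper's simulator: the class name, block/page geometry, and the full-merge policy are assumptions. Most logical pages resolve through a coarse block-level map, while recently written pages are redirected into page-mapped log blocks; when log space runs out, a merge erases a log block, which is the wear the paper measures.

```python
# Toy BAST-style hybrid FTL sketch (illustrative assumption, not the
# paper's kernel-level simulator). Coarse block-level map + fine-grained
# page-level map for a small pool of log blocks.

PAGES_PER_BLOCK = 4  # assumed geometry for the sketch


class HybridFTL:
    def __init__(self, num_blocks, num_log_blocks):
        # Block-level map: logical block number -> physical data block.
        self.block_map = {lb: lb for lb in range(num_blocks)}
        # Page-level map for log blocks: logical page -> (phys block, page).
        self.log_map = {}
        self.free_log = list(range(num_blocks, num_blocks + num_log_blocks))
        self.log_cursor = {}   # open log block -> next free page index
        self.erase_count = 0   # proxy for wear-out caused by merges

    def translate(self, lpn):
        """Resolve a logical page number to a physical (block, page)."""
        if lpn in self.log_map:              # fine-grained mapping wins
            return self.log_map[lpn]
        lb, off = divmod(lpn, PAGES_PER_BLOCK)
        return (self.block_map[lb], off)     # coarse block-level mapping

    def write(self, lpn):
        """Out-of-place update: redirect the page into a log block."""
        for blk, cur in self.log_cursor.items():
            if cur < PAGES_PER_BLOCK:        # room left in an open log block
                self.log_map[lpn] = (blk, cur)
                self.log_cursor[blk] = cur + 1
                return
        if self.free_log:                    # open a fresh log block
            blk = self.free_log.pop(0)
            self.log_cursor[blk] = 1
            self.log_map[lpn] = (blk, 0)
            return
        self._merge()                        # no log space: garbage-collect
        self.write(lpn)

    def _merge(self):
        # Full merge: fold log pages back to data blocks (elided here)
        # and erase the log block -- each erase consumes endurance.
        blk = next(iter(self.log_cursor))
        self.log_map = {l: p for l, p in self.log_map.items() if p[0] != blk}
        del self.log_cursor[blk]
        self.free_log.append(blk)
        self.erase_count += 1
```

A small log pool keeps the mapping table compact (block-level) while absorbing random writes at page granularity; the trade-off the paper probes is that workloads with scattered updates exhaust the log pool quickly and trigger costly merges, inflating both write amplification and erase counts.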