HDFS Heterogeneous Storage Resource Management Based on Data Temperature

Rohith Subramanyam
Published in: 2015 International Conference on Cloud and Autonomic Computing
Publication date: 2015-09-21
DOI: 10.1109/ICCAC.2015.33 (https://doi.org/10.1109/ICCAC.2015.33)
Citations: 8

Abstract

Hadoop has traditionally been used as a large-scale batch processing system. However, interactive applications such as Facebook Messenger are becoming increasingly prominent in the Hadoop world. A key bottleneck in adapting Hadoop to real-time processing is the disk data transfer rate. The advent of Solid State Drives (SSDs) holds great promise in this regard, as they provide bandwidth that is orders of magnitude higher than that of rotating disks. But because of their higher cost per gigabyte, a common approach is to deploy heterogeneous storage types. This paper presents a Storage Resource Management technique that automatically and dynamically moves data across this tiered storage based on Data Temperature, migrating "hot" data towards faster storage and "cold" data towards inexpensive archival storage. Thus, the cluster adapts to the characteristics of its workloads over time to make effective use of the scarce, expensive storage. Finally, I evaluate my modified version of the Hadoop Distributed File System (HDFS) against the vanilla version to compare their performance. The results are promising: both read and write performance improve, and the improvement in read performance is significant.
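
The abstract does not say how Data Temperature is computed or which thresholds trigger a migration, so the following is only a minimal sketch of the general idea, not the paper's implementation. It assumes a hypothetical temperature metric (an exponentially decayed access count), two hypothetical thresholds, three illustrative tiers (SSD, DISK, ARCHIVE), and a hypothetical `ssd_slots` cap standing in for the scarce fast storage. All names (`FileStats`, `update_temperature`, `target_tier`, `plan_migrations`) are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class FileStats:
    path: str
    tier: str                 # assumed tiers: "SSD", "DISK", "ARCHIVE"
    temperature: float = 0.0  # decayed access count (assumed metric)


def update_temperature(stats, accesses, decay=0.5):
    """Blend the previous temperature with the accesses seen this interval."""
    stats.temperature = decay * stats.temperature + accesses
    return stats.temperature


def target_tier(temperature, hot_threshold=10.0, cold_threshold=1.0):
    """Map a temperature to a tier using two assumed thresholds."""
    if temperature >= hot_threshold:
        return "SSD"
    if temperature <= cold_threshold:
        return "ARCHIVE"
    return "DISK"


def plan_migrations(files, ssd_slots):
    """Return (path, from_tier, to_tier) moves, considering hottest files first.

    ssd_slots caps how many files may occupy the fast tier; hot files that
    do not fit are placed on DISK instead.
    """
    moves = []
    used = 0
    for f in sorted(files, key=lambda f: f.temperature, reverse=True):
        wanted = target_tier(f.temperature)
        if wanted == "SSD":
            if used < ssd_slots:
                used += 1
            else:
                wanted = "DISK"
        if wanted != f.tier:
            moves.append((f.path, f.tier, wanted))
    return moves
```

Running `update_temperature` once per monitoring interval and then `plan_migrations` lets a file that stops being read cool down gradually instead of being demoted after a single quiet interval. For comparison, stock HDFS exposes manual tiering through `hdfs storagepolicies -setStoragePolicy` and the `hdfs mover` tool; whether the paper builds on those is not stated in the abstract.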