Understanding the role of memory subsystem on performance and energy-efficiency of Hadoop applications

Hosein Mohammadi Makrani, Shahab Tabatabaei, S. Rafatirad, H. Homayoun
Published in: 2017 Eighth International Green and Sustainable Computing Conference (IGSC), 2017-10-01
DOI: 10.1109/IGCC.2017.8323591 (https://doi.org/10.1109/IGCC.2017.8323591)
Citations: 14

Abstract

The memory subsystem has always been one of the performance bottlenecks in computer systems. Given the large size of the data involved, two questions are becoming important: whether Big Data requires big memory, and whether the main memory subsystem plays an intrinsic role in the performance and energy efficiency of Big Data applications. In this paper, through a comprehensive real-system experimental analysis of performance, power, and resource utilization, we evaluate the main memory characteristics of Hadoop MapReduce, a de facto standard for big data analytics. Using a methodical experimental setup, we analyze the impact of DRAM capacity, operating frequency, and number of channels on power and performance to understand the main memory requirements of this important Big Data framework. The characterization results across various Hadoop MapReduce applications from different domains show that Hadoop MapReduce workloads exhibit two distinct behaviors: they are either CPU-intensive or disk-intensive. Our experimental results show that neither DRAM frequency nor the number of channels plays a significant role in the performance of Hadoop workloads. On the other hand, our results indicate that increasing the number of DRAM channels reduces DRAM power and improves the energy efficiency of Hadoop MapReduce applications.
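The quantities the abstract varies (DRAM frequency and channel count) and measures (power, performance, energy efficiency) can be related with simple arithmetic. The sketch below is illustrative only: it assumes a standard 64-bit (8-byte) DDR channel for the peak-bandwidth formula, and it uses energy and energy-delay product (EDP) as example efficiency metrics. The abstract does not state which metric the authors used, and the function names here are hypothetical.

```python
def peak_bandwidth_gbs(channels: int, transfer_rate_mts: float,
                       bus_width_bytes: int = 8) -> float:
    """Theoretical peak DRAM bandwidth in GB/s (1 GB = 1e9 bytes).

    Assumes each channel moves `bus_width_bytes` per transfer
    (8 bytes for a standard 64-bit DDR channel).
    """
    return channels * transfer_rate_mts * 1e6 * bus_width_bytes / 1e9


def energy_joules(avg_power_watts: float, runtime_seconds: float) -> float:
    """Energy consumed by a run: average power times execution time."""
    return avg_power_watts * runtime_seconds


def energy_delay_product(avg_power_watts: float, runtime_seconds: float) -> float:
    """EDP = energy * delay; lower is better.

    EDP is one common energy-efficiency metric, assumed here for illustration.
    """
    return energy_joules(avg_power_watts, runtime_seconds) * runtime_seconds


if __name__ == "__main__":
    # Placeholder inputs, not measurements from the paper.
    for channels in (1, 2, 4):
        bw = peak_bandwidth_gbs(channels, 1600)
        print(f"{channels} channel(s) at 1600 MT/s: {bw:.1f} GB/s peak")
    print("EDP for 100 W over 60 s:", energy_delay_product(100.0, 60.0), "J*s")
```

Peak bandwidth scales linearly with both channel count and transfer rate, so a workload that uses little of that bandwidth would gain little runtime from more of either, which is consistent with the abstract's performance finding. Energy efficiency can still change if average power changes while runtime stays flat.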