Hadoop Workloads Characterization for Performance and Energy Efficiency Optimizations on Microservers

Maria Malik;Katayoun Neshatpour;Setareh Rafatirad;Houman Homayoun
Journal: IEEE Transactions on Multi-Scale Computing Systems, vol. 4, no. 3, pp. 355-368
Published: 2017-09-05
DOI: 10.1109/TMSCS.2017.2749228
URL: https://ieeexplore.ieee.org/document/8025821/
Citations: 10

Abstract

Traditional low-power embedded processors such as Atom and ARM are entering the high-performance server market. At the same time, big data analytics applications are emerging and dramatically changing the landscape of data center workloads. These applications require a significant amount of server computational power, yet the rapid growth in data makes it challenging to process efficiently on current high-performance server architectures. Furthermore, physical design constraints such as power and density have become the dominant limiting factors for scaling out servers. Numerous big data applications rely on the Hadoop MapReduce framework to analyze large-scale datasets. Because Hadoop configuration parameters and system parameters directly affect MapReduce job performance and energy efficiency, joint tuning of application-, system-, and architecture-level parameters is vital to maximizing the energy efficiency of Hadoop-based applications. In this work, through methodical investigation of performance and power measurements, we demonstrate how the interplay among Hadoop configuration parameters and system- and architecture-level parameters affects not only performance but also energy efficiency across various big data applications. Our results identify trends to guide scheduling decisions and key insights to help improve the performance, power, and energy efficiency of Hadoop MapReduce applications on microservers.
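
The abstract's point that the fastest configuration need not be the most energy-efficient one can be illustrated with a minimal sketch. It is not the authors' methodology: it only assumes that each run records an execution time and an average power, and that energy is their product. The parameter names (block size, mapper count, core frequency) are hypothetical examples of application-, system-, and architecture-level knobs, and the numbers are placeholders, not measurements from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    """One measured job run. All field names are hypothetical."""
    hdfs_block_mb: int   # application-level knob (assumed example)
    mappers: int         # system-level knob (assumed example)
    cpu_ghz: float       # architecture-level knob (assumed example)
    runtime_s: float     # measured execution time in seconds
    avg_power_w: float   # measured average power in watts

    @property
    def energy_j(self) -> float:
        # Energy (joules) = average power (watts) * time (seconds)
        return self.avg_power_w * self.runtime_s


def fastest(runs):
    """Return the run with the shortest execution time."""
    return min(runs, key=lambda r: r.runtime_s)


def most_energy_efficient(runs):
    """Return the run that consumed the least energy."""
    return min(runs, key=lambda r: r.energy_j)


# Placeholder values for illustration only, NOT results from the paper.
runs = [
    Run(hdfs_block_mb=64, mappers=2, cpu_ghz=1.2, runtime_s=400.0, avg_power_w=10.0),
    Run(hdfs_block_mb=128, mappers=4, cpu_ghz=1.2, runtime_s=250.0, avg_power_w=14.0),
    Run(hdfs_block_mb=128, mappers=4, cpu_ghz=2.4, runtime_s=200.0, avg_power_w=20.0),
]

best_time = fastest(runs)
best_energy = most_energy_efficient(runs)
print(f"fastest: {best_time.cpu_ghz} GHz, {best_time.runtime_s} s")
print(f"least energy: {best_energy.cpu_ghz} GHz, {best_energy.energy_j} J")
```

With these placeholder values the two selections differ, which is why the parameters have to be evaluated jointly against both time and energy rather than against performance alone.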