Thermal and power-aware task scheduling for Hadoop based storage centric datacenters

International Conference on Green Computing. Pub Date: 2010-08-15. DOI: 10.1109/GREENCOMP.2010.5598262

Bing Shi, Ankur Srivastava


Citations: 22

Abstract

Apache Hadoop is a framework for managing large-scale, storage-based datacenters whose primary job is to deliver data to clients. In such systems, the central scheduling task is to assign each data request to one specific replica among the many available replicas. This assignment determines how workload and power are distributed across the storage servers. In this paper, we explore thermal- and power-aware task scheduling for Hadoop-based, storage-centric datacenters. To maintain datacenter reliability, every node must operate below a given temperature threshold. At the same time, we want to minimize the total power consumed by the air conditioning (A/C) system that provides the cooling needed to stay below that threshold. We formulate the resulting optimization problem as an Integer Linear Programming (ILP) problem and develop a minimum-cost-flow-based heuristic to solve it. Experimental results show that our method requires the A/C output air temperature to be only 0.69 K lower on average than the optimal ILP solution does, while its runtime is only 1%–2.5% of the ILP solver's runtime. By comparison, selecting a replica at random for each data request requires an A/C output air temperature 6.35 K lower than our method does, which forces the A/C system to work harder.
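The abstract gives no details of the heuristic beyond saying it is based on minimum cost flow, so the following is only a minimal sketch of how replica selection can be posed as a min-cost flow problem, not the authors' algorithm. Everything in it is an assumption made for illustration: the function name `assign_requests`, the per-server request limit `capacity`, and the integer `heat_cost` weights, which act as a stand-in for each server's thermal impact. The k-th request placed on a server is charged `heat_cost[s] * k`, so piling load onto one server becomes progressively more expensive.

```python
from collections import deque

def assign_requests(replicas, capacity, heat_cost):
    """Pick one replica server per request via successive shortest paths.

    replicas[r]  : servers holding the data for request r (hypothetical input)
    capacity[s]  : assumed maximum number of requests server s may serve
    heat_cost[s] : assumed integer weight standing in for thermal impact
    Returns (assignment, total_cost); assignment[r] is None if r is unserved.
    """
    n_req, n_srv = len(replicas), len(capacity)
    source, sink = 0, 1 + n_req + n_srv
    # Each edge is [to, residual_capacity, cost, index_of_reverse_edge].
    graph = [[] for _ in range(sink + 1)]

    def add_edge(u, v, cap, cost):
        graph[u].append([v, cap, cost, len(graph[v])])
        graph[v].append([u, 0, -cost, len(graph[u]) - 1])

    for r, servers in enumerate(replicas):
        add_edge(source, 1 + r, 1, 0)              # each request is served once
        for s in servers:
            add_edge(1 + r, 1 + n_req + s, 1, 0)   # request may use this replica
    for s in range(n_srv):
        for k in range(1, capacity[s] + 1):
            # The k-th request on server s costs heat_cost[s] * k (convex cost).
            add_edge(1 + n_req + s, sink, 1, heat_cost[s] * k)

    total_cost = 0
    while True:
        # Queue-based Bellman-Ford: cheapest augmenting path in the residual graph.
        dist = [float("inf")] * len(graph)
        dist[source] = 0
        prev = [None] * len(graph)
        queued = [False] * len(graph)
        queue = deque([source])
        while queue:
            u = queue.popleft()
            queued[u] = False
            for i, (v, cap, cost, _) in enumerate(graph[u]):
                if cap > 0 and dist[u] + cost < dist[v]:
                    dist[v] = dist[u] + cost
                    prev[v] = (u, i)
                    if not queued[v]:
                        queue.append(v)
                        queued[v] = True
        if prev[sink] is None:
            break                                   # no more requests can be routed
        v = sink
        while v != source:                          # push one unit along the path
            u, i = prev[v]
            graph[u][i][1] -= 1
            graph[v][graph[u][i][3]][1] += 1
            v = u
        total_cost += dist[sink]

    assignment = []
    for r in range(n_req):
        # A saturated request-to-server edge marks the chosen replica.
        chosen = [e[0] - 1 - n_req for e in graph[1 + r]
                  if e[0] > n_req and e[1] == 0]
        assignment.append(chosen[0] if chosen else None)
    return assignment, total_cost

# Toy instance: 4 requests, 3 servers, every server limited to 2 requests.
assignment, cost = assign_requests([[0, 1], [0, 1], [0, 2], [0]],
                                   [2, 2, 2], [1, 2, 3])
print(assignment, cost)
```

In this toy instance the last request can only go to server 0, and the increasing per-server cost spreads the remaining requests across servers 1 and 2 rather than filling the cheapest server first. A real formulation would replace `heat_cost` with a thermal model of the datacenter, which the abstract does not specify.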
