{"title":"Energy-efficient sorting using solid state disks","authors":"A. Beckmann, U. Meyer, P. Sanders, J. Singler","doi":"10.1109/GREENCOMP.2010.5598309","DOIUrl":"https://doi.org/10.1109/GREENCOMP.2010.5598309","url":null,"abstract":"We take sorting of large data sets as case study for making data-intensive applications more energy-efficient. Using a low-power processor, solid state disks, and efficient algorithms, we beat the current records in the JouleSort benchmark for 10GB to 1 TB of data by factors of up to 5.1. Since we also use parallel processing, this usually comes without a performance penalty.","PeriodicalId":262148,"journal":{"name":"International Conference on Green Computing","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127785819","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reduction of leakage energy in low level caches","authors":"Tomoaki Ukezono, Kiyofumi Tanaka","doi":"10.1109/GREENCOMP.2010.5598268","DOIUrl":"https://doi.org/10.1109/GREENCOMP.2010.5598268","url":null,"abstract":"Recently, leakage energy in cache memories is growing. In past studies, techniques that reduce leakage energy in cache memories by partial inactivation, or techniques that find cache areas to be inactivated were proposed. In this paper, we discuss temporal locality in multi-level caches. Then we propose a technique that reduces leakage energy in low level (L2) caches by using a dynamic optimization system. In the proposed technique, the dynamic optimization system first detects load/store instructions that exhibit no temporal locality in low level (L2) caches. The detected load/store instructions are then replaced with new instructions. When the new instructions cause a miss in L2 caches, the requested block is loaded only on L1 caches and the corresponding cache block in L2 caches is turned off. (Inclusion property is supposed.) The evaluation results for 19 programs in SPEC CPU 2000 benchmarks showed that the proposed technique could reduce leakage energy in L2 cache memories by up to 94.04%, or by 52.10% on average.","PeriodicalId":262148,"journal":{"name":"International Conference on Green Computing","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128066522","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dynamic Partitioned Global Address Spaces for power efficient DRAM virtualization","authors":"Jeffrey S. Young, S. Yalamanchili","doi":"10.1109/GREENCOMP.2010.5598278","DOIUrl":"https://doi.org/10.1109/GREENCOMP.2010.5598278","url":null,"abstract":"Dynamic Partitioned Global Address Spaces (DPGAS) is an abstraction that allows for quick and efficient remapping of physical memory addresses within a global address space, enabling more efficient sharing of remote DRAM. While past work has proposed several uses for DPGAS [1], the most pressing issue in today's data centers is reducing power. This work uses a detailed simulation infrastructure to study the effects of using DPGAS to reduce overall data center power through low-latency accesses to “virtual” DIMMs. Virtual DIMMs are remote DIMMs that can be mapped into a local node's address space using existing operating system abstractions and low-level hardware support to abstract the DIMM's location from the application using it. By using a simple spill-receive memory allocation model, we show that DPGAS can reduce memory power from 18% to 49% with a hardware latency of 1 to 2 µs in typical usage scenarios. Additionally, we demonstrate the range of scenarios where DPGAS can be realized over a shared 10 Gbps Ethernet link with normal network traffic.","PeriodicalId":262148,"journal":{"name":"International Conference on Green Computing","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132570470","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Stretch and compress based re-scheduling techniques for minimizing the execution times of DAGs on multi-core processors under energy constraints","authors":"David King, I. Ahmad, Hafiz Fahad Sheikh","doi":"10.1109/GREENCOMP.2010.5598274","DOIUrl":"https://doi.org/10.1109/GREENCOMP.2010.5598274","url":null,"abstract":"Given an initial schedule of a parallel program represented by a directed acyclic graph (DAG) and an energy constraint, the question arises how to effectively determine what nodes (tasks) can be penalized (slowed down) through the use of dynamic voltage scaling. The resulting re-schedule length with a strict energy budget should have a minimum amount of expansion compared to the original schedule achieved with full energy. We propose three static schemes that aim to achieve this goal. Each scheme encompasses submitting a schedule to either a conceptual “stretch” (starting tasks with a maximum voltage supplied to all cores followed by methodical voltage reductions) or “compress” (starting tasks with a minimum voltage supplied to all cores followed by methodical voltage boosts). The complexity arises due to the inter-dependence of tasks. We propose methods that efficiently make such findings by analyzing the DAG and determining the “impact factor” of a node in the graph for the purpose of guiding the schedule toward the desired goal. The comparison between the stretch-alone and compress-alone based algorithms leads to a third algorithm that employs schedule “compression,” but reschedules all cores following each successive voltage adjustment. Detailed simulation experiments demonstrate the effect of various task and processor parameters on the performance of the proposed algorithms.","PeriodicalId":262148,"journal":{"name":"International Conference on Green Computing","volume":"173 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132597510","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Environmentally Opportunistic Computing transforming the data center for economic and environmental sustainability","authors":"P. Brenner, Ryan Jansen, D. Go, D. Thain","doi":"10.1109/GREENCOMP.2010.5598289","DOIUrl":"https://doi.org/10.1109/GREENCOMP.2010.5598289","url":null,"abstract":"The United States Environmental Protection Agency forecasts the 2011 national IT electric energy expenditure will grow toward $7.4 billion [1]. In parallel to economic IT energy concerns, the general public and environmental advocacy groups are demanding proactive steps toward sustainable green processes. Our contribution to the solution of this problem is Environmentally Opportunistic Computing (EOC). Our Green Cloud EOC prototype serves as an operational demonstration that IT resources can be integrated with the dominant energy footprint of existing facilities and dynamically controlled to balance process throughput, thermal transfer, and available cooling via process management and migration. The Green Cloud is a sustainable computing technology that complements existing efficiency improvements at the application, operating system and hardware levels. Exhaust heat energy is transferred directly to an adjacent greenhouse facility and cooling is provided by free cooling methods. We will describe the architecture and operation of this successful prototype that has led to its growing use in our production environments.","PeriodicalId":262148,"journal":{"name":"International Conference on Green Computing","volume":"243 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133280613","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The SNIC/KTH PRACE prototype: Achieving high energy efficiency with commodity technology without acceleration","authors":"S. Johnsson, Daniel Ahlin, John Wang","doi":"10.1109/GREENCOMP.2010.5598259","DOIUrl":"https://doi.org/10.1109/GREENCOMP.2010.5598259","url":null,"abstract":"Energy efficiency has become one of the most important considerations for HPC systems, particularly for large scale systems, for economic and environmental reasons and in exceptional cases also social and political. Many approaches are currently being pursued both in regards to architecture and hardware and software technologies to improve energy efficiency for HPC systems. The prototype described here, one of several within the PRACE project exploring improved energy efficiency, explores energy efficiency achievable through use of commodity components for cost effectiveness, and without acceleration for preservation/ease of portability of the large application code base that exists for the type of HPC systems that have been dominating for a decade. The prototype development was a collaborative effort between industry and academia. With a very limited budget for a server design project and severe time constraints the novelty was effectively limited to careful component choices in regards to energy efficiency for HPC workloads and a new motherboard design to support the component choices. A further constraint was that the outcome would be of production quality in order for the industry partners to market the prototype design should it be successful. For the component choices we did a characterization of the power consumption of a blade chassis and made an effort to measure the energy consumption of different memory modules under HPC workloads, information we could find neither in the literature nor from memory or system vendors. Memory power consumption in the prototype, as well as most HPC systems, is second only to the CPU, sometimes a close second. We report on the design of the prototype, and preliminary performance results with an emphasis on the energy aspects of benchmarks and compare our results with the Blue Gene/P that, after its introduction, has dominated the top of the Green500 list for systems not using acceleration. The preliminary results show that energy efficiency comparable to the BG/P can be achieved without any proprietary technology at a fraction of the cost. The prototype design is now included in the standard product line of the participating platform vendor.","PeriodicalId":262148,"journal":{"name":"International Conference on Green Computing","volume":"159 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134297702","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Nonclairvoyantly scheduling power-heterogeneous processors","authors":"Anupam Gupta, Ravishankar Krishnaswamy, K. Pruhs","doi":"10.1109/GREENCOMP.2010.5598311","DOIUrl":"https://doi.org/10.1109/GREENCOMP.2010.5598311","url":null,"abstract":"We show that a natural nonclairvoyant online algorithm for scheduling jobs on a power-heterogeneous multiprocessor is bounded-speed bounded-competitive for the objective of flow plus energy.","PeriodicalId":262148,"journal":{"name":"International Conference on Green Computing","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134537381","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Thermal and power-aware task scheduling for Hadoop based storage centric datacenters","authors":"Bing Shi, Ankur Srivastava","doi":"10.1109/GREENCOMP.2010.5598262","DOIUrl":"https://doi.org/10.1109/GREENCOMP.2010.5598262","url":null,"abstract":"Apache Hadoop is a framework for managing large scale storage based datacenters whose primary job is to deliver data to clients. In such systems, the primary job is to associate each data request to a specific data replica among many available replicas. This assignment impacts the workload and power distribution across the storage servers. In this paper, we explore thermal and power aware task scheduling for Hadoop based storage centric datacenters. In order to maintain the reliability of datacenters, we would like to make sure that each node in the datacenter operates at a temperature below a certain temperature threshold. At the same time, we would like to minimize the total power consumption in the air conditioning (A/C) system that provides the cooling for maintaining the temperature. We formulate the resultant optimization problem as an Integer Linear Programming problem and develop a minimum cost flow based heuristic to solve the problem. The experimental results show that our method forces the A/C system to output air temperature only 0.69K lower on average compared to the optimal ILP solution. However, the runtime of our method is only 1%–2.5% of the runtime using an ILP solver. Also, random selection of data replica for each data request results in the required A/C output air temperature to be 6.35K lower than our method, which forces the A/C system to work harder.","PeriodicalId":262148,"journal":{"name":"International Conference on Green Computing","volume":"11 3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133057765","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reducing instruction TLB's leakage power consumption for embedded processors","authors":"Zhao Lei, Hui Xu, D. Ikebuchi, H. Amano, T. Sunata, M. Namiki","doi":"10.1109/GREENCOMP.2010.5598277","DOIUrl":"https://doi.org/10.1109/GREENCOMP.2010.5598277","url":null,"abstract":"This paper presents a leakage efficient instruction TLB (Translation Lookaside Buffer) design for embedded processors. The key observation is that when programs enter a physical page following instructions tend to be fetched from the same page for a rather long time. Thus, by employing a small storage structure which stores the recent address-translation information, the TLB access frequency can be drastically decreased and the instruction TLB can be turned into the low leakage mode with the dual voltage supply technique. Based on such a design philosophy, three different implementation policies are proposed. Evaluation results with eight MiBench programs show that the proposed design can reduce the leakage power of the instruction TLB by 50% on average, with only 0.01% performance degradation.","PeriodicalId":262148,"journal":{"name":"International Conference on Green Computing","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117161530","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Accurate modeling and prediction of energy availability in energy harvesting real-time embedded systems","authors":"Jun Lu, Shaobo Liu, Qing Wu, Qinru Qiu","doi":"10.1109/GREENCOMP.2010.5598280","DOIUrl":"https://doi.org/10.1109/GREENCOMP.2010.5598280","url":null,"abstract":"Energy availability is the primary subject that drives the research innovations in energy harvesting systems. In this paper, we first propose a novel concept of effective energy dissipation that defines a unique quantity to accurately quantify the energy dissipation of the system by including not only the energy demand by the electronic circuit, but also the energy overhead incurred by energy flows amongst system components. This work also addresses the techniques in run-time prediction of future harvested energy. These two contributions significantly improve the accuracy of energy availability computation for the proposed Model-Accurate Predictive DVFS algorithm, which aims at achieving best system performance under energy harvesting constraints. Experimental results show the improvements achieved by the MAP-DVFS algorithm in deadline miss rate. In addition, we illustrate the trend of system performance variation under different conditions and system design parameters.","PeriodicalId":262148,"journal":{"name":"International Conference on Green Computing","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128447231","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}