{"title":"A Low-Cost Stochastic Computing-based Fuzzy Filtering for Image Noise Reduction","authors":"Seyedeh Newsha Estiri, Amir Hossein Jalilvand, S. Naderi, M. Najafi, Mahdi Fazeli","doi":"10.1109/IGSC55832.2022.9969358","DOIUrl":"https://doi.org/10.1109/IGSC55832.2022.9969358","url":null,"abstract":"Images are often corrupted with noise. As a result, noise reduction is an important task in image processing. Common noise reduction techniques, such as mean or median filtering, lead to blurring of the edges in the image, while fuzzy filters are able to preserve the edge information. In this work, we implement an efficient hardware design for a well-known fuzzy noise reduction filter based on stochastic computing. The filter consists of two main stages: edge detection and fuzzy smoothing. The fuzzy difference, which is encoded as bit-streams, is used to detect edges. Then, fuzzy smoothing is done to average the pixel value based on eight directions. Our experimental results show a significant reduction in the hardware area and power consumption compared to the conventional binary implementation while preserving the quality of the results.","PeriodicalId":114200,"journal":{"name":"2022 IEEE 13th International Green and Sustainable Computing Conference (IGSC)","volume":"70 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127351568","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploring Automatic Gym Workouts Recognition Locally on Wearable Resource-Constrained Devices","authors":"Sizhen Bian, Xiaying Wang, T. Polonelli, M. Magno","doi":"10.1109/IGSC55832.2022.9969370","DOIUrl":"https://doi.org/10.1109/IGSC55832.2022.9969370","url":null,"abstract":"Automatic gym activity recognition on energy- and resource-constrained wearable devices removes the human-interaction requirement during intense gym sessions, such as soft-touch tapping and swiping. This work presents a tiny and highly accurate residual convolutional neural network that runs on milliwatt microcontrollers for automatic workout classification. We evaluated the inference performance of the quantized deep model on three resource-constrained devices: two microcontrollers with ARM Cortex-M4 and Cortex-M7 cores from STMicroelectronics, and a GAP8 system on chip, an open-source, multi-core RISC-V computing platform from GreenWaves Technologies. Experimental results show an accuracy of up to 90.4% for recognizing eleven workouts with full-precision inference. The paper also presents the performance trade-offs of the resource-constrained systems. While keeping the recognition accuracy (88.1%) with minimal loss, each inference takes only 3.2 ms on GAP8, benefiting from the 8 RISC-V cluster cores. We measured an execution time that is 18.9x and 6.5x faster than on the Cortex-M4 and Cortex-M7 cores, showing the feasibility of real-time on-board workout recognition based on the described data set with a 20 Hz sampling rate. The energy consumed for each inference on GAP8 is 0.41 mJ, compared to 5.17 mJ on Cortex-M4 and 8.07 mJ on Cortex-M7 at the maximum clock. This can lead to longer battery life when the system is battery-operated. We also introduce a publicly available open data set composed of fifty sessions of eleven gym workouts collected from ten subjects.","PeriodicalId":114200,"journal":{"name":"2022 IEEE 13th International Green and Sustainable Computing Conference (IGSC)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116840014","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Electrical Commissioning Owner's Project Requirements: A Template","authors":"Brandon Hong, E. Thomason, Aditya M. Deshpande","doi":"10.1109/IGSC55832.2022.9969369","DOIUrl":"https://doi.org/10.1109/IGSC55832.2022.9969369","url":null,"abstract":"As the power demands of Supercomputers continue to grow, so do the demands on the electrical systems that support the infrastructure and building in which these Supercomputers reside. A typical new Supercomputer installation requires an upgrade to the design of the electrical system. Supercomputers are refreshed roughly every 3 years, which in turn drives electrical system upgrades. The pre-design phase is critical for planning the installation of a new Supercomputer and requires documenting the overarching project purpose, goals, expectations, preferences, and limitations for the electrical systems, especially as the number of stakeholders increases. This Owner's Project Requirements (OPR) document then becomes guidance to the engineering and design teams for the development of the initial basis-of-design and subsequent construction documents. The electrical systems commissioning OPR provides a guideline for stakeholders to make sure that the electrical systems are well designed ‘up-front’ in the process of installing a new Supercomputer. It also serves as a guiding checklist that readers can use to inform their own project guiding documents. This document will assist the owner and respective HPC infrastructure stakeholders in writing an OPR for the electrical systems supporting data centers or high-performance computing (HPC) facilities. This paper provides a template for developing an electrical system commissioning OPR. The template is sub-divided into sections that should be discussed and documented as part of the overall project requirements. The expectation is that this outline template forms a starting point for discussions on generating a guiding document for the commissioning of the electrical systems, and that it standardizes the best practices and processes needed for certifying the electrical commissioning of HPC Supercomputer facilities.","PeriodicalId":114200,"journal":{"name":"2022 IEEE 13th International Green and Sustainable Computing Conference (IGSC)","volume":"64 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126551697","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Soft Cluster Powercap at SuperMUC-NG with EAR","authors":"J. Corbalán, Lluis Alonso, C. Navarrete, Carla Guillén","doi":"10.1109/IGSC55832.2022.9969360","DOIUrl":"https://doi.org/10.1109/IGSC55832.2022.9969360","url":null,"abstract":"This paper describes the Soft Cluster Powercap management system implemented and evaluated on SuperMUC-NG using the EAR software. SuperMUC-NG is one of the biggest supercomputers in Europe with 6480 Intel Skylake Xeon Platinum 8174, and EAR is the system software used for energy management. SuperMUC-NG has a power limit with a certain degree of tolerance: the limit can be exceeded for a short time, as long as the average power stays under the hard limit over a longer period. Otherwise, the data center would incur a cost penalty. We call this use case Soft Cluster Powercap, since it differs from the traditional Hard Cluster Powercap, where the power limit cannot be exceeded. This paper presents the design of the EAR node powercap and the Soft Cluster Powercap, and an evaluation of both. The evaluation included in this paper has been limited to CPU-only kernels and applications for the node powercap and to one island of SuperMUC-NG (792 nodes) for the Soft Cluster Powercap. Currently the solution is deployed in the whole cluster.","PeriodicalId":114200,"journal":{"name":"2022 IEEE 13th International Green and Sustainable Computing Conference (IGSC)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130256308","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Unified Cross-Layer Cluster-Node Scheduling for Heterogeneous Datacenters","authors":"Wenkai Guan, Cristinel Ababei","doi":"10.1109/IGSC55832.2022.9969366","DOIUrl":"https://doi.org/10.1109/IGSC55832.2022.9969366","url":null,"abstract":"In this paper, we present a two-level hierarchical scheduler for datacenters called Qin. The goal of the proposed scheduler is to exploit increased server heterogeneity. It combines cluster- and node-level scheduling algorithms in a unified approach, and it can consider specific optimization objectives including job completion time, energy usage, and energy delay product (EDP). Its novelty lies in the unified approach and in modeling interference and heterogeneity. Experiments on a real cluster demonstrate that the proposed approach outperforms state-of-the-art schedulers by 10.2% in completion time, 38.65% in energy usage, and 41.98% in EDP.","PeriodicalId":114200,"journal":{"name":"2022 IEEE 13th International Green and Sustainable Computing Conference (IGSC)","volume":"318 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124506391","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimal Launch Bound Selection in CPU-GPU Hybrid Graph Applications with Deep Learning","authors":"Md. Erfanul Haque Rafi, Apan Qasem","doi":"10.1109/IGSC55832.2022.9969364","DOIUrl":"https://doi.org/10.1109/IGSC55832.2022.9969364","url":null,"abstract":"Graph algorithms, which are at the heart of emerging computation domains such as machine learning, are notoriously difficult to optimize because of their irregular behavior. The challenges are magnified on current CPU-GPU heterogeneous platforms. In this paper, we study the problem of GPU launch bound configuration in hybrid graph algorithms. We train a multi-objective deep neural network to learn a function that maps input graph characteristics and runtime program behavior to a set of launch bound parameters. When applying launch bounds predicted by our neural network in BFS and SSSP algorithms, we observe as much as 2.76× speedup on certain graph instances and an overall speedup of 1.31× and 1.61×, respectively. Similar improvements are seen in the energy efficiency of the applications, with an average reduction of 14% in peak power consumption across 20 real-world input graphs. Evaluation of the neural network shows that it is robust and generalizable and yields close to 90% accuracy on cross-validation.","PeriodicalId":114200,"journal":{"name":"2022 IEEE 13th International Green and Sustainable Computing Conference (IGSC)","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130619365","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Energy-Performance-Security Trade-off in Mobile Edge Computing","authors":"Mahipal P. Singh, S. Sankaran","doi":"10.1109/IGSC55832.2022.9969375","DOIUrl":"https://doi.org/10.1109/IGSC55832.2022.9969375","url":null,"abstract":"Multi-access Edge Computing (MEC), also known as Mobile Edge Computing, is a type of edge computing that extends the capabilities of cloud computing by bringing resources to the edge of the network. Traditional cloud computing occurs on remote servers far away from users and IoT devices, whereas MEC allows computing processes to take place at base stations, central offices or other aggregation points within the transport network. However, placing MEC nodes near the data-generating devices gives rise to several challenges, such as attacks launched on the nodes from data-generating or network edge devices. In this paper, we simulate Distributed Denial of Service (DDoS) and routing-based attacks to determine their impact on energy and performance. In addition, we propose a novel approach for mitigating DDoS attacks on MEC nodes. Our approach accurately discriminates between high-rate and low-rate DDoS attacks and provides defence against both of them. Our method has a 90% success rate in detecting and thwarting DDoS attacks on MEC nodes.","PeriodicalId":114200,"journal":{"name":"2022 IEEE 13th International Green and Sustainable Computing Conference (IGSC)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130855511","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Review of Smart Buildings Protocol and Systems with a Consideration of Security and Energy Awareness","authors":"Mini Zeng","doi":"10.1109/IGSC55832.2022.9969359","DOIUrl":"https://doi.org/10.1109/IGSC55832.2022.9969359","url":null,"abstract":"In this paper, we discuss different smart building communication protocols and systems. We discuss the security features of the existing smart building communication protocols and systems. We also discuss the possible attacks and vulnerabilities affecting building automation systems. We provide a taxonomy of the most popular smart building communication protocols, considering security features and energy-saving solutions. The motivation of this paper is to guide the designers and developers of smart building systems with security and energy awareness in mind.","PeriodicalId":114200,"journal":{"name":"2022 IEEE 13th International Green and Sustainable Computing Conference (IGSC)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122351416","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Raptor: Mitigating CPU-GPU False Sharing Under Unified Memory Systems","authors":"Md. Erfanul Haque Rafi, Kaylee Williams, Apan Qasem","doi":"10.1109/IGSC55832.2022.9969376","DOIUrl":"https://doi.org/10.1109/IGSC55832.2022.9969376","url":null,"abstract":"The introduction of Unified Memory (UM) technology has greatly increased the programmability of CPU-GPU heterogeneous systems. At the same time, Unified Memory systems have given rise to new performance challenges. Achieving the desired performance and energy efficiency on such systems requires careful consideration of data allocation and migration. This paper looks at the problem of false sharing under UM. We present Raptor, a system for fast and accurate detection of page-level false sharing in heterogeneous applications. The system employs binary code instrumentation and leverages hardware performance counters to track UM allocations and data access patterns and pinpoint energy inefficiencies created by the occurrence of false sharing. Experiments on a suite of heterogeneous applications show false sharing can be a common occurrence in collaborative design paradigms with tight coupling of CPU-GPU tasks. When false sharing is eliminated via a padding scheme, applications are able to achieve higher performance at lower clock frequencies, leading to improved energy efficiency by as much as 2.96× and by 1.62× and 1.47× on average on two contemporary CPU-GPU platforms.","PeriodicalId":114200,"journal":{"name":"2022 IEEE 13th International Green and Sustainable Computing Conference (IGSC)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128634426","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MOSP: Multi-Objective Sensitivity Pruning of Deep Neural Networks","authors":"Muhammad Sabih, Ashutosh Mishra, Frank Hannig, Jürgen Teich","doi":"10.1109/IGSC55832.2022.9969374","DOIUrl":"https://doi.org/10.1109/IGSC55832.2022.9969374","url":null,"abstract":"Deep neural networks (DNNs) are computationally intensive, making them difficult to deploy on resource-constrained embedded systems. Model compression is a set of techniques that removes redundancies from a neural network with affordable degradation in task performance. Most compression methods do not target hardware-based objectives such as latency directly; however, a few methods approximate latency with floating-point operations (FLOPs) or multiply-accumulate operations (MACs). These indirect metrics do not translate directly to the relevant performance metrics on the hardware, i.e., latency and throughput. To address this limitation, we introduce Multi-Objective Sensitivity Pruning, “MOSP,” a three-stage pipeline for filter pruning: hardware-aware sensitivity analysis, Criteria-optimal configuration selection, and pruning based on explainable AI (XAI). Our pipeline is compatible with a single target objective or a combination of target objectives such as latency, energy consumption, and accuracy. Our method first formulates the sensitivity of layers of a model against the target objectives as a classical machine learning problem. Next, we choose a Criteria-optimal configuration controlled by hyperparameters specific to each objective of choice. Finally, we apply XAI-based filter ranking to select filters to be pruned. The pipeline follows an iterative pruning methodology to recover any loss in task performance (e.g., accuracy). We allow the user to prefer one objective function over the other. Our method outperforms the selected baseline method across different neural networks and datasets in both accuracy and latency reductions and is competitive with state-of-the-art approaches.","PeriodicalId":114200,"journal":{"name":"2022 IEEE 13th International Green and Sustainable Computing Conference (IGSC)","volume":"158 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127546916","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}