{"title":"Dynamic Supply Voltage Level Generation for Minimum Energy Real Time Tasks using Geometric Programming","authors":"H. Manohara, B. Harish","doi":"10.1109/SOCC46988.2019.1570555698","DOIUrl":"https://doi.org/10.1109/SOCC46988.2019.1570555698","url":null,"abstract":"Communication and computation are moving towards mobile platforms to address the demands of emerging applications. Despite advances in process and battery technologies that allow processors to provide much greater computation per unit of energy and longer battery life, the fundamental tradeoff between performance and battery life remains critical. To maximize the energy efficiency of processors in mobile electronics, Dynamic Voltage Scaling (DVS) is conventionally deployed to dynamically vary the supply voltage, and hence the speed, at run time. The nonlinear relationship between CPU speed and power consumption enables task execution to be spread out in time by using the available slack to reduce the voltage, rather than running the CPU at full speed for short bursts and then switching to an idle state. The proposed work aims to minimize the energy consumption of each task of real-time periodic task sets in a uniprocessor environment, using the task utilization factor as a control variable for generating the optimized supply voltage for every task in the task sets. The energy minimization of a task is implemented using Geometric Programming (GP), by varying the frequency over a range on fixed task sets and on randomly varying task set instances, and hence generating supply voltage levels. Results demonstrate that energy savings vary between 18% and 34% for standard task sets and average 77% for randomly generated task sets, depending on the power delay characteristics of the task sets.","PeriodicalId":253998,"journal":{"name":"2019 32nd IEEE International System-on-Chip Conference (SOCC)","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130748151","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fused Multiply-Add for Variable Precision Floating-Point","authors":"A. Nannarelli","doi":"10.1109/SOCC46988.2019.1570555329","DOIUrl":"https://doi.org/10.1109/SOCC46988.2019.1570555329","url":null,"abstract":"In this work, we address the design of a Fused Multiply-Add (FMA) in Tunable Floating-Point (TFP). TFP is a floating-point variable precision format in which a given precision for significand and exponent can be chosen for a single operation. The objective is to increase the power efficiency of the computation by tuning the precision of algorithms that can tolerate some error. The performance of the FMA is compared to that of separate multiply and add units on computation kernels used in several applications.","PeriodicalId":253998,"journal":{"name":"2019 32nd IEEE International System-on-Chip Conference (SOCC)","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116567998","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cycle-Accurate Evaluation of Software-Hardware Co-Design of Decimal Computation in RISC-V Ecosystem","authors":"Riaz-ul-haque Mian, Michihiro Shintani, M. Inoue","doi":"10.1109/SOCC46988.2019.1570559752","DOIUrl":"https://doi.org/10.1109/SOCC46988.2019.1570559752","url":null,"abstract":"Software-hardware co-design solutions for decimal computation can provide several Pareto points to development of embedded systems in terms of hardware cost and performance. This paper demonstrates how to accurately evaluate such co-design solutions using RISC-V ecosystem. In a software-hardware co-design solution, a part of solution requires dedicated hardware. In our evaluation framework, we develop new decimal oriented instructions supported by an accelerator. The framework can realize cycle-accurate analysis for performance as well as hardware overhead for co-design solutions for decimal computation. The obtained performance result is compared with an estimation with dummy functions.","PeriodicalId":253998,"journal":{"name":"2019 32nd IEEE International System-on-Chip Conference (SOCC)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133812270","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ML-based Thermal Sensor Calibration by Bivariate Gaussian Mixture Model Estimation","authors":"W. Kuo, Li-Wei Liu, Yen-Chin Liao, Hsie-Chia Chang","doi":"10.1109/SOCC46988.2019.1570561880","DOIUrl":"https://doi.org/10.1109/SOCC46988.2019.1570561880","url":null,"abstract":"This paper presents a machine-learning-based post signal processing to calibrate thermal sensors. The proposed calibration scheme is shown to be immune to the interference from the environment and fulfills the high-resolution requirements of human body temperature measurements. The sensing module comprises two resistive sensing circuits, one is for sensing the external temperature, and the other is for sensing the internal die temperature. By using these two thermal outputs, we trained two-dimensional multivariate Gaussian models for several temperature intervals. Higher accuracy can be obtained via the probability-based estimation. The simulation results show high accuracy even in a noisy environment. The proposed algorithm is implemented and fabricated in UMC 0.18 μm CMOS-MEMS technology. The sensor chip is tested by an embedded system (ARM V2M-MPS2). The measurement results show that the proposed method can effectively improve the accuracy from 1 degree Celsius to 0.1 degree Celsius.","PeriodicalId":253998,"journal":{"name":"2019 32nd IEEE International System-on-Chip Conference (SOCC)","volume":"74 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123086160","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Glitch Key-Gate for Logic Locking","authors":"De-Xuan Ji, Hsiao-Yu Chiang, Chia-Chun Lin, Chia-Cheng Wu, Yung-Chih Chen, Chun-Yao Wang","doi":"10.1109/SOCC46988.2019.1570547988","DOIUrl":"https://doi.org/10.1109/SOCC46988.2019.1570547988","url":null,"abstract":"Logic locking is a technique used for intellectual property protection. An effective attacking method based on satisfiability (SAT) algorithm, known as SAT attack, was proposed to decrypt an encrypted design successfully. To strengthen logic locking, this paper proposes a glitch-based logic locking method designed for sequential circuits. The proposed new schemes of key-gates can generate glitches, and use rising and falling transitions as key-inputs for the comprehensive logic locking. Experimental results show that the proposed glitch key-gate (GK) has high capability to be embedded in a set of IWLS2005 Benchmarks [22]. The cell area overhead in the designs encrypted with GKs are 10.68%, 12.22%, and 26.11% on average for encryptions with 8, 16, and 32 key-inputs, respectively, and the overhead can be reduced substantially when the GKs are combined with other logic locking methods.","PeriodicalId":253998,"journal":{"name":"2019 32nd IEEE International System-on-Chip Conference (SOCC)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128530463","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Efficient Event-driven Neuromorphic Architecture for Deep Spiking Neural Networks","authors":"Duy-Anh Nguyen, Duy-Hieu Bui, F. Iacopi, Xuan-Tu Tran","doi":"10.1109/SOCC46988.2019.1570548305","DOIUrl":"https://doi.org/10.1109/SOCC46988.2019.1570548305","url":null,"abstract":"Deep Neural Networks (DNNs) have been successfully applied to various real-world machine learning applications. However, performing large DNN inference tasks in real-time remains a challenge due to its substantial computational costs. Recently, Spiking Neural Networks (SNNs) have emerged as an alternative way of processing DNN tasks. Due to its event-based, data-driven computation, SNN reduces both inference latency and complexity. With efficient conversion methods from traditional DNN, SNN exhibits similar accuracy, while leveraging many state-of-the-art network models and training methods. In this work, an efficient neuromorphic hardware architecture for image recognition task is presented. To preserve accuracy, the analog-to-spiking conversion algorithm is adopted. The system aims to minimize hardware area cost and power consumption, enabling neuromorphic hardware processing in edge devices. Simulation results have shown that, with the MNIST digit recognition task, the system has achieved $\\times 20$ reduction in terms of core area cost compared to the state-of-the-art works, with an accuracy of 94.4%, core area of 15 $\\mu m^{2}$ at a maximum frequency of 250 MHz.","PeriodicalId":253998,"journal":{"name":"2019 32nd IEEE International System-on-Chip Conference (SOCC)","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128589156","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Accelerating Binary-Matrix Multiplication on FPGA","authors":"Debjyoti Bhattacharjee, A. Chattopadhyay, Ricardo Jack Liwongan","doi":"10.1109/SOCC46988.2019.1570544215","DOIUrl":"https://doi.org/10.1109/SOCC46988.2019.1570544215","url":null,"abstract":"Matrix multiplication is required for a wide variety of applications, including data mining, linear algebra, graph transformations, etc. Most of the existing works to accelerate matrix multiplication have focused on matrices with floating point elements. In this work, we propose for the first time an FPGA based accelerator architecture for binary matrix multiplication. It consists of processing elements laid out in regular tiled manner. The communication structure used is a torus. We undertook detailed experimental study of the proposed architecture. The architecture shows excellent scalability with increase in number of processing elements, with minimal drop in operating frequency. The proposed system achieves maximum throughput of 1120 Gops for $4 \\times 4$ network size with $2048 \\times 2048$ matrix size. The performance achieved by the system is considerably higher than existing works of floating point multiplication on FPGAs, due to optimized PE design for binary matrix multiplication.","PeriodicalId":253998,"journal":{"name":"2019 32nd IEEE International System-on-Chip Conference (SOCC)","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114477496","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Register Requirement Minimization of Fixed-Depth Pipelines for Streaming Data Applications","authors":"T. Goldbrunner, N. Doan, Diogo Poças, Thomas Wild, A. Herkersdorf","doi":"10.1109/SOCC46988.2019.1570548393","DOIUrl":"https://doi.org/10.1109/SOCC46988.2019.1570548393","url":null,"abstract":"We present a method that can be used to map control/data flow graphs into fixed-depth pipelines targeted at FPGA design. The main objective for the design is to reduce the register resources which are needed to forward data within the processing pipeline. We show that these requirements can be reduced by appropriate task scheduling. Starting from an intuitive network flow based scheduling approach, we develop a linear programming model of the task scheduling problem. This allows us to efficiently create schedules which are provably optimal with regard to the objective of minimal register usage.","PeriodicalId":253998,"journal":{"name":"2019 32nd IEEE International System-on-Chip Conference (SOCC)","volume":"74 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125474157","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}