Ugo Mureddu, Brice Colombier, Nathalie Bochard, L. Bossuet, V. Fischer
{"title":"Transient Effect Ring Oscillators Leak Too","authors":"Ugo Mureddu, Brice Colombier, Nathalie Bochard, L. Bossuet, V. Fischer","doi":"10.1109/ISVLSI.2019.00016","DOIUrl":"https://doi.org/10.1109/ISVLSI.2019.00016","url":null,"abstract":"Up to now, the transient effect ring oscillator (TERO) seemed to be a better building block for PUFs than a standard ring oscillator, since it was thought to be immune to electromagnetic analysis. Here, we report for the first time that TERO PUFs are in fact vulnerable to electromagnetic analysis too. First, we propose a spectral model of a TERO cell output, showing how to fit it to experimental data obtained with the help of a spectrum analyser to recover the number of oscillations of a TERO cell. We then extend it to two TERO cells oscillating simultaneously, and show how this ability can be used to fully clone a TERO PUF. These results should help designers to better plan for susceptibility of TERO PUFs to electromagnetic analysis in their future designs.","PeriodicalId":6703,"journal":{"name":"2019 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"433 1","pages":"37-42"},"PeriodicalIF":0.0,"publicationDate":"2019-07-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91551339","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Alexandros Kouris, Stylianos I. Venieris, C. Bouganis
{"title":"Towards Efficient On-Board Deployment of DNNs on Intelligent Autonomous Systems","authors":"Alexandros Kouris, Stylianos I. Venieris, C. Bouganis","doi":"10.1109/ISVLSI.2019.00107","DOIUrl":"https://doi.org/10.1109/ISVLSI.2019.00107","url":null,"abstract":"With their unprecedented performance in major AI tasks, deep neural networks (DNNs) have emerged as a primary building block in modern autonomous systems. Intelligent systems such as drones, mobile robots and driverless cars largely base their perception, planning and application-specific tasks on DNN models. Nevertheless, due to the nature of these applications, such systems require on-board local processing in order to retain their autonomy and meet latency and throughput constraints. In this respect, the large computational and memory demands of DNN workloads pose a significant barrier to their deployment on the resource- and power-constrained compute platforms that are available on-board. This paper presents an overview of recent methods and hardware architectures that address the system-level challenges of modern DNN-enabled autonomous systems at both the algorithmic and hardware design level. Spanning from latency-driven approximate computing techniques to high-throughput mixed-precision cascaded classifiers, the presented set of works paves the way for the on-board deployment of sophisticated DNN models on robots and autonomous systems.","PeriodicalId":6703,"journal":{"name":"2019 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"9 1","pages":"568-573"},"PeriodicalIF":0.0,"publicationDate":"2019-07-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81896013","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yiming Hu, Shuang Liang, Jincheng Yu, Yu Wang, Huazhong Yang
{"title":"On-Chip Instruction Generation for Cross-Layer CNN Accelerator on FPGA","authors":"Yiming Hu, Shuang Liang, Jincheng Yu, Yu Wang, Huazhong Yang","doi":"10.1109/ISVLSI.2019.00011","DOIUrl":"https://doi.org/10.1109/ISVLSI.2019.00011","url":null,"abstract":"Convolutional neural networks (CNNs) are gaining popularity in the field of computer vision. CNN-based methods are computationally intensive and resource-consuming, and are thus hard to integrate into embedded systems and apply to real-time task scenarios. Many FPGA-based CNN accelerators have been proposed to achieve higher performance. Cross-layer CNN accelerators are designed to reduce data transfer by fusing several layers. However, the instruction size that needs to be transferred is usually considerable, leading to a performance drop of cross-layer accelerators. In this study, we develop an on-chip instruction generation method based on the cross-layer accelerator to reduce the total instruction size transferred to the chip. We design the corresponding hardware module and modify existing object detection models according to the hardware structure to improve the accuracy of object detection tasks. The evaluation results show that in the same calculation process, our accelerator can achieve 35% data transfer reduction on the VGG16 network. The average instruction size and compilation time are reduced by 95% using our instruction generation method. The performance of the accelerator reaches 1414 GOP/s.","PeriodicalId":6703,"journal":{"name":"2019 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"37 1","pages":"7-12"},"PeriodicalIF":0.0,"publicationDate":"2019-07-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87279612","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SPN-DPUF: Substitution-Permutation Network Based Secure Circuit for Digital PUF","authors":"Johan Marconot, D. Hély, Florian Pebay-Peyroula","doi":"10.1109/ISVLSI.2019.00018","DOIUrl":"https://doi.org/10.1109/ISVLSI.2019.00018","url":null,"abstract":"Securing the integrated circuit lifecycle requires authentication mechanisms in order to prevent counterfeiting and to prevent illegal access to private assets. Physical unclonable functions (PUFs) are good candidates to provide authentication services. However, PUFs may be sensitive to noise and environmental conditions, inducing reliability issues. Digital PUFs (DPUFs), which are by design inherently robust, have recently been proposed. In this paper, we investigate the utilization, the security and the efficiency of interrogation circuitries for DPUFs. We present the concept of digital disorder based PUF primitives and related work on fabrication processes and interrogation circuitries for DPUFs, discussing their advantages and limitations. We then study the requirements to exploit this digital and reliable source of entropy and deploy a strong PUF design. We propose new models of logical layers for challenge-response mechanisms based on substitution-permutation networks, which could be integrated along the randomized structure. We simulate and evaluate the different structures to estimate a first security-performance trade-off, respecting both security and resource constraints.","PeriodicalId":6703,"journal":{"name":"2019 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"9 1","pages":"49-54"},"PeriodicalIF":0.0,"publicationDate":"2019-07-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87625711","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Not All Feed-Forward MUX PUFs Generate Unique Signatures","authors":"A. Ayling, S. V. S. Avvaru, K. Parhi","doi":"10.1109/ISVLSI.2019.00017","DOIUrl":"https://doi.org/10.1109/ISVLSI.2019.00017","url":null,"abstract":"A fundamental property of physical unclonable functions (PUFs) is that they generate unique outputs that cannot be reproduced by another chip, even with an identical circuit and layout design. Several configurations of feed-forward PUFs (FF PUFs) are evaluated in terms of their interchip variation, a measure of uniqueness. In general, PUFs are considered to be unique due to symmetry in their path delay distributions, which are typically Gaussian. In this paper, we prove that certain FF PUFs can result in skewed path delay distributions leading to poor uniqueness. In these PUFs, the total delay difference is the sum of a symmetric Gaussian distribution and an asymmetric half-Gaussian distribution. We also compute empirical estimates and verify our observations by simulating 200 PUFs in each FF configuration. It is observed that (1) FF PUFs with one intermediate arbiter and an odd number of feed-forward loops and (2) FF PUFs in cascade or separate configurations have degraded interchip variation. This is the first study to observe and prove the non-uniqueness property of such PUFs.","PeriodicalId":6703,"journal":{"name":"2019 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"118 1","pages":"43-48"},"PeriodicalIF":0.0,"publicationDate":"2019-07-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87966051","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Ferroelectric FET Based TCAM Designs for Energy Efficient Computing","authors":"Xunzhao Yin, D. Reis, M. Niemier, X. Hu","doi":"10.1109/ISVLSI.2019.00085","DOIUrl":"https://doi.org/10.1109/ISVLSI.2019.00085","url":null,"abstract":"As Moore's law based device scaling and accompanying performance scaling trends slow down, there is increasing interest in new technologies and computational paradigms that enable faster and more energy-efficient information processing. Meanwhile, there is growing evidence that in the context of traditional Boolean circuits and/or von Neumann architectures, it will be challenging for beyond-CMOS devices to compete with the CMOS technology. Exploiting the unique characteristics of emerging devices – especially in the context of alternative circuits and architectural paradigms – has the potential to offer orders of magnitude improvement in terms of energy and/or performance. In this work, we show how our research work has leveraged the unique characteristics of emerging devices to build efficient circuits and architectures with significant improvements in energy and performance for various data-intensive applications. Specifically, we consider Ferroelectric FETs (FeFETs) which are nonvolatile and can function as both a transistor and a storage element. This unique property enables FeFETs to be used for building area-efficient and low-power ternary content addressable memories (TCAMs). TCAMs are desirable in many applications including network routers and cognitive learning tasks. Using models calibrated against experimentally demonstrated ferroelectric materials or devices, as well as detailed circuit simulations, we show that the FeFET-based TCAMs we proposed can enable orders of magnitude improvements in energy efficiency and performance when considering array-level computing tasks in the IoT domain.","PeriodicalId":6703,"journal":{"name":"2019 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"252 1","pages":"437-442"},"PeriodicalIF":0.0,"publicationDate":"2019-07-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72660694","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Countering Botnet of Things using Blockchain-Based Authenticity Framework","authors":"Pinchen Cui, Ujjwal Guin","doi":"10.1109/ISVLSI.2019.00112","DOIUrl":"https://doi.org/10.1109/ISVLSI.2019.00112","url":null,"abstract":"The success and widespread use of Internet of Things (IoT) bring remarkable contributions and economic benefits in various fields. However, the increasing number of devices also raises security concerns. The prevalence of Botnet of Things (BoT) has been observed and it has been recently reported that the launched attacks affect multiple domains and have caused unacceptable losses. As the majority of IoT devices are manufactured off-shore, ensuring their identity becomes one of the major challenges. Cloned devices, with backdoors for malicious purposes, can give the adversary an undue advantage in compromising a system even though proper security measures are in place. In this paper, we propose a novel blockchain-based framework to provide traceability of hardware. A unique identity for every IoT device is ensured using a physically unclonable function (PUF). The blockchain provides the verification of these devices by comparing these unique IDs. HyperLedger is selected to implement the blockchain-based framework, and its performance is evaluated and analyzed.","PeriodicalId":6703,"journal":{"name":"2019 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"27 1","pages":"598-603"},"PeriodicalIF":0.0,"publicationDate":"2019-07-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88425572","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Defense-Net: Defend Against a Wide Range of Adversarial Attacks through Adversarial Detector","authors":"A. S. Rakin, Deliang Fan","doi":"10.1109/ISVLSI.2019.00067","DOIUrl":"https://doi.org/10.1109/ISVLSI.2019.00067","url":null,"abstract":"Recent studies have demonstrated that Deep Neural Networks (DNNs) are vulnerable to adversarial input perturbations: meticulously engineered slight perturbations can result in inappropriate categorization of valid images. Adversarial Training has been one of the successful defense approaches in recent times. In this work, we propose an alternative to adversarial training by training a separate model with adversarial examples instead of the original classifier. We train an adversarial detector network known as 'Defense-Net' with a strong adversary while training the original classifier with only clean training data. We propose a new adversarial cross entropy loss function to train Defense-Net to appropriately differentiate between different adversarial examples. Defense-Net solves three major concerns regarding the development of a successful adversarial defense method. First, our defense does not have clean data accuracy degradation in contrast to traditional adversarial training based defenses. Second, we demonstrate this resiliency with experiments on the MNIST and CIFAR-10 data sets, and show that the state-of-the-art accuracy under the most powerful known white-box attack was increased from 94.02% to 99.2% on MNIST, and from 47% to 94.79% on CIFAR-10. Finally, unlike most recent defenses, our approach does not suffer from obfuscated gradients and can successfully defend against strong BPDA, PGD, FGSM and C&W attacks.","PeriodicalId":6703,"journal":{"name":"2019 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"86 1","pages":"332-337"},"PeriodicalIF":0.0,"publicationDate":"2019-07-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79004962","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Karthikeyan Nagarajan, Sina Sayyah Ensan, S. Mandal, Swaroop Ghosh, A. Chattopadhyay
{"title":"iMACE: In-Memory Acceleration of Classic McEliece Encoder","authors":"Karthikeyan Nagarajan, Sina Sayyah Ensan, S. Mandal, Swaroop Ghosh, A. Chattopadhyay","doi":"10.1109/ISVLSI.2019.00098","DOIUrl":"https://doi.org/10.1109/ISVLSI.2019.00098","url":null,"abstract":"Asymmetric code-based crypto-systems have been developed in the last decade due to the rapid evolution of quantum computing that can potentially compromise RSA and ECC based crypto-systems. The McEliece crypto-system based on the general decoding problem is one of the front-runner candidates for post-quantum cryptography, but its energy efficiency is limited by the heavy data traffic between the processing elements and the memory. In-memory computing (IMC) architectures can remove the energy-efficiency barriers posed by von Neumann computing due to movement of data between the processor and the memory. Emerging non-volatile memories (NVM) such as Resistive RAM (ReRAM) implemented in a crossbar array are promising substrates to realize IMC due to excellent High Resistance State (HRS) to Low Resistance State (LRS) ratios and high densities. Therefore, McEliece can benefit substantially from in-memory acceleration. We propose iMACE, a high-performance and area-efficient hardware implementation of the core encoding function of McEliece by exploiting ReRAM-based IMC. Simulation results show 18.8X-94X better throughput and 46%-97% reduction in energy consumption compared to the FPGA-based implementation.","PeriodicalId":6703,"journal":{"name":"2019 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"122 1","pages":"513-518"},"PeriodicalIF":0.0,"publicationDate":"2019-07-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79106215","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Design of a Hierarchical Clos-Benes Optical Network-on-Chip Architecture","authors":"Renjie Yao, Yaoyao Ye, Weichen Liu","doi":"10.1109/ISVLSI.2019.00100","DOIUrl":"https://doi.org/10.1109/ISVLSI.2019.00100","url":null,"abstract":"As chip multiprocessors keep growing in capability, on-chip communication efficiency is crucial to the overall performance. However, on-chip networks based on electronic switches suffer from excessive power consumption and limited performance. In order to take advantage of optical interconnects for large-scale on-chip communication in chip multiprocessors, we propose a design of hierarchical Clos-Benes optical network-on-chip (NoC) with an optimized control and routing scheme. The proposed control and routing scheme includes a priority based round-robin virtual output queue selection and a Q-learning based heuristic routing algorithm for the Clos network, and a traffic-aware adaptive routing for the intra-switch Benes network. By taking network load and runtime path allocation into account, the proposed Q-learning based heuristic routing can predict the best alternative path among all available paths with a much better path allocation success rate. A case study on a 256-core chip multiprocessor under uniform traffic shows that the network throughput is increased by 400%, 60%, and 16% compared with the mesh, fat-tree and baseline Clos-Benes optical NoC, respectively. Averaged over a set of real applications, the application ETE delay is reduced by 48%, 29%, and 20% compared with the mesh, fat-tree and baseline Clos-Benes network, respectively.","PeriodicalId":6703,"journal":{"name":"2019 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"17 1","pages":"523-528"},"PeriodicalIF":0.0,"publicationDate":"2019-07-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81861268","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}