Andrea Solazzo, Emanuele Del Sozzo, Irene De Rose, M. Silvestri, Gianluca Durelli, M. Santambrogio
{"title":"Hardware Design Automation of Convolutional Neural Networks","authors":"Andrea Solazzo, Emanuele Del Sozzo, Irene De Rose, M. Silvestri, Gianluca Durelli, M. Santambrogio","doi":"10.1109/ISVLSI.2016.101","DOIUrl":"https://doi.org/10.1109/ISVLSI.2016.101","url":null,"abstract":"Convolutional Neural Networks (CNNs) are a variation of feed-forward Neural Networks inspired by the biological process in the visual cortex of animals. Interest in this supervised learning algorithm has grown rapidly in many fields, such as image and video recognition and natural language processing. CNNs have become the state of the art in various applications, such as mobile robot vision, video surveillance and Big Data analytics. The specific computation pattern of CNNs is highly suitable for hardware acceleration; in fact, different types of accelerators have been proposed based on GPUs, Field Programmable Gate Arrays (FPGAs) and ASICs. In particular, in the embedded systems context, due to real-time and power consumption challenges, it is crucial to find the right tradeoff between performance, energy efficiency, fast development round and cost. This work proposes a framework meant as a tool for the user to accelerate and simplify the design and implementation of CNNs on FPGAs by leveraging High Level Synthesis, while still providing a certain level of customization of the hardware design.","PeriodicalId":140647,"journal":{"name":"2016 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114669830","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A. Chattopadhyay, V. Pudi, Anubhab Baksi, T. Srikanthan
{"title":"FPGA Based Cyber Security Protocol for Automated Traffic Monitoring Systems: Proposal and Implementation","authors":"A. Chattopadhyay, V. Pudi, Anubhab Baksi, T. Srikanthan","doi":"10.1109/ISVLSI.2016.97","DOIUrl":"https://doi.org/10.1109/ISVLSI.2016.97","url":null,"abstract":"There is a rapidly growing interest in the field of unmanned road vehicles across the world. To aid the traffic management of such systems, there is an urgent need to develop appropriate security protocols facilitating car-to-car and car-to-traffic controller systems. Ensuring security requires both confidentiality (the message is understandable only to intended recipients) and authenticity (the message is not tampered with during communication), both of which are provided by an Authenticated Encryption with Associated Data (AEAD) scheme. In this paper, we propose a new AEAD-based protocol for secure and authenticated transmission of videos captured by traffic monitoring systems to the base station in real time. Our protocol utilizes ACORN v2, a lightweight AEAD primitive. For the secret key to be used in encryption-authentication, we use the concept of Physically Unclonable Functions (PUFs). The entire protocol is implemented and evaluated with an FPGA-based prototype, using a 640x480 pixel camera with 30 frames per second. The area required for the proposed protocol is 5% of the total FPGA device (Xilinx Zynq-XC7Z020-1clg484).","PeriodicalId":140647,"journal":{"name":"2016 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128124944","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Chenchen Liu, Qing Yang, Bonan Yan, Jianlei Yang, Xiaocong Du, Weijie Zhu, Hao Jiang, Qing Wu, Mark D. Barnell, Hai Helen Li
{"title":"A Memristor Crossbar Based Computing Engine Optimized for High Speed and Accuracy","authors":"Chenchen Liu, Qing Yang, Bonan Yan, Jianlei Yang, Xiaocong Du, Weijie Zhu, Hao Jiang, Qing Wu, Mark D. Barnell, Hai Helen Li","doi":"10.1109/ISVLSI.2016.46","DOIUrl":"https://doi.org/10.1109/ISVLSI.2016.46","url":null,"abstract":"Matrix-vector multiplication, as a key computing operation, has been widely adopted in applications and hence greatly affects the execution efficiency. A common technique to enhance the performance of matrix-vector multiplication is increasing execution parallelism, which results in higher design cost. In recent years, new devices and structures have been widely investigated as alternative solutions. Among them, the memristor crossbar demonstrates great potential for its intrinsic support of matrix-vector multiplication, high integration density, and built-in parallel execution. However, the computation accuracy and speed of such designs are limited and constrained by the features of the crossbar array and peripheral circuitry. In this work, we propose a new memristor crossbar based computing engine design by leveraging a current sensing scheme. High operation parallelism and therefore fast computation can be achieved by simultaneously supplying analog voltages into a memristor crossbar and directly detecting weighted currents through current amplifiers. The performance and effectiveness of the proposed design were examined through the implementation of a neural network for pattern recognition based on the MNIST database. Compared to a previously reported design, ours increases the recognition accuracy by 8.1% (to 94.6%).","PeriodicalId":140647,"journal":{"name":"2016 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"114 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133247374","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Attacking an SRAM-Based PUF through Wearout","authors":"A. Roelke, M. Stan","doi":"10.1109/ISVLSI.2016.68","DOIUrl":"https://doi.org/10.1109/ISVLSI.2016.68","url":null,"abstract":"Physical unclonable functions (PUFs) provide a fast and cheap solution to secret key generation. Natural variations in silicon create unique “fingerprints” that are useful for identification. In a 6T SRAM array, these variations cause individual cells to skew their power-on tendency toward storing a 0 or a 1. Wearout effects interfere with those variations, changing the power-on behavior of an SRAM cell in the opposite direction of a stored bit and affecting its reliability as a PUF. In this work, we take advantage of this effect by exposing an SRAM array to high voltage and temperature to activate and accelerate wearout and show that it can cause significant changes to the SRAM's fingerprint. Then we propose an attack on an SRAM PUF that makes use of these conditions with several stored data patterns to modify its fingerprint and then compare the effectiveness of each pattern at producing false negatives for identification challenges. In doing so, we show that false negatives can be increased to 100% in less than 24 hours, effectively erasing the fingerprint and rendering the PUF unusable.","PeriodicalId":140647,"journal":{"name":"2016 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129652716","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A New Process Variation Monitoring Circuit","authors":"D. Mirzoyan, Ararat Khachatryan","doi":"10.1109/ISVLSI.2016.26","DOIUrl":"https://doi.org/10.1109/ISVLSI.2016.26","url":null,"abstract":"A new process variation monitoring circuit (PVMC) is proposed in this paper. The goal is to generate a digital code whose value characterizes the process corner. The circuit uses only metal-oxide-semiconductor (MOS) transistors to detect variation of their parameters, or the process corner, by generating digital signals. Process variation is detected based on variation of parameters of the n-type MOS transistor, such as threshold voltage and oxide thickness. The proposed circuit's operation is based on a method called “dynamic measurement”, which makes it possible to monitor process variation or corner using only one type of transistor (n-type or p-type). The absence of devices such as bipolar transistors, diodes and resistors, and of reference voltage or current (or other parameter) sources, leads to increased detection accuracy and simplicity, as well as reduced circuit area and current. Post-layout simulation results confirm the correct functionality of the proposed circuit. Although the proposed circuit detects the process variation or corner from parameters of n-type MOS transistors, it can easily be modified to also detect variation of p-type MOS transistors, resistors or other components.","PeriodicalId":140647,"journal":{"name":"2016 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"64 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132646529","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Chinmay Deshpande, Bilgiday Yuce, N. F. Ghalaty, D. Ganta, P. Schaumont, L. Nazhandali
{"title":"A Configurable and Lightweight Timing Monitor for Fault Attack Detection","authors":"Chinmay Deshpande, Bilgiday Yuce, N. F. Ghalaty, D. Ganta, P. Schaumont, L. Nazhandali","doi":"10.1109/ISVLSI.2016.123","DOIUrl":"https://doi.org/10.1109/ISVLSI.2016.123","url":null,"abstract":"In this paper, we propose a cycle-accurate monitor that can efficiently detect timing violation based fault attacks. The proposed monitor detects clock or voltage manipulations by monitoring the external clock using an internal Ring Oscillator. The monitor is low cost in terms of area and power consumption and can be easily implemented using the standard cell based VLSI design flow. In addition to the architecture of the timing monitor, we present a detailed analysis on the design considerations that affect the cost and accuracy of the monitor. To validate the functionality of the monitor, we implemented it on Spartan-6 FPGA. We also synthesized our monitor onto IBM 90nm ASIC technology to examine the effects of process variation and aging. We show that the proposed method brings 0.23% area and 1.4% power overhead on a reference AES-128 hardware implementation.","PeriodicalId":140647,"journal":{"name":"2016 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121094466","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adaptive Overclocking and Error Correction Based on Dynamic Speculation Window","authors":"R. Ragavan, C. Killian, O. Sentieys","doi":"10.1109/ISVLSI.2016.13","DOIUrl":"https://doi.org/10.1109/ISVLSI.2016.13","url":null,"abstract":"Error detection and correction based on double-sampling is a common technique to handle timing errors while scaling Vdd for energy efficiency. An additional sampling element is inserted in the critical paths of the design to double-sample, at different time instants, the outputs of those logic paths that may fail while scaling the supply voltage or the clock frequency of the design. However, the overclocking and the error detection and correction capabilities of double-sampling methods are limited by the fixed speculation window, which lacks adaptability for tracking variations such as temperature. In this paper, we introduce a dynamic speculation window to be used in double-sampling schemes for timing error detection and correction in pipelined logic paths. The proposed method employs online slack measurement and the conventional shadow flip-flop approach to adaptively overclock or underclock the design and also to detect and correct timing errors due to temperature and other variability effects. We demonstrate this method on the Xilinx Virtex VC707 FPGA for various benchmarks. We achieve a maximum of 71% overclocking with a limited area overhead of 1.9% LUTs and 1.7% flip-flops.","PeriodicalId":140647,"journal":{"name":"2016 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115510263","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Satyajit Das, T. Peyret, Kevin J. M. Martin, G. Corre, M. Thévenin, P. Coussy
{"title":"A Scalable Design Approach to Efficiently Map Applications on CGRAs","authors":"Satyajit Das, T. Peyret, Kevin J. M. Martin, G. Corre, M. Thévenin, P. Coussy","doi":"10.1109/ISVLSI.2016.54","DOIUrl":"https://doi.org/10.1109/ISVLSI.2016.54","url":null,"abstract":"Coarse-Grained Reconfigurable Architectures (CGRAs) are promising high-performance and power-efficient platforms. However, their use is still limited by the current capabilities of mapping tools. This paper presents a new scalable and efficient design flow to map applications written in a high-level language onto CGRAs. This approach leverages simultaneous scheduling and binding steps, based respectively on a heuristic and on a stochastically degenerated exact method. The formal graph model of the application, obtained after compilation, is backward traversed and dynamically transformed when needed to allow for a better exploration of the design space. Results show that our approach is scalable, finds the best solutions (i.e., the mappings with the shortest latencies) most of the time, achieves the lowest failure rate in finding solutions, requires less computation time and explores the solution space more efficiently than state-of-the-art methods.","PeriodicalId":140647,"journal":{"name":"2016 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128258914","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mesbah Uddin, M. Majumder, G. Rose, K. Beckmann, H. Manem, Z. Alamgir, N. Cady
{"title":"Techniques for Improved Reliability in Memristive Crossbar PUF Circuits","authors":"Mesbah Uddin, M. Majumder, G. Rose, K. Beckmann, H. Manem, Z. Alamgir, N. Cady","doi":"10.1109/ISVLSI.2016.33","DOIUrl":"https://doi.org/10.1109/ISVLSI.2016.33","url":null,"abstract":"Hardware security has emerged as an important field of study aimed at mitigating issues such as integrated circuit (IC) piracy and counterfeiting. One popular solution to such hardware security attacks is the physically unclonable function (PUF), which provides a hardware-specific unique identification based on intrinsic process variations within individual integrated circuit implementations. At the same time, as technology scaling progresses further into the nanometer region, emerging nanoelectronic technologies such as memristors become viable options. Several examples of nanoelectronic memristor-based PUF circuits have been proposed in the last few years. In this paper, we analyze the behavior of crossbar memristive PUF circuits under different environmental conditions such as varying temperature, supply rail voltage fluctuations and aging. We also present an approach that improves the reliability of these circuits, taking environmental variations into consideration. The advantages and challenges associated with these PUFs are also discussed in detail. Specifically, we show results for security metrics including reliability, uniqueness and uniformity. These security performance results are presented alongside estimates for power, area and delay, showing the advantages of using nanoelectronic PUFs from the perspective of efficient resource utilization.","PeriodicalId":140647,"journal":{"name":"2016 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130092337","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Dylan C. Stow, Itir Akgun, Russell Barnes, P. Gu, Yuan Xie
{"title":"Cost and Thermal Analysis of High-Performance 2.5D and 3D Integrated Circuit Design Space","authors":"Dylan C. Stow, Itir Akgun, Russell Barnes, P. Gu, Yuan Xie","doi":"10.1109/ISVLSI.2016.133","DOIUrl":"https://doi.org/10.1109/ISVLSI.2016.133","url":null,"abstract":"3D Integration is a promising technology to continue the trend of Moore's law. However, higher density from die stacking introduces thermal challenges that require more expensive packaging and cooling solutions. An alternative integration technology is interposer-based 2.5D design, which has fewer thermal issues but adds extra interposer cost. Designers must be aware of the system-level cost benefits of these choices early in the design process. This paper presents a cost analysis model with wafer costs, 3D bonding costs, and thermal modeling for the optimization of package and cooling costs. The cost model is used to explore the design space of integrated circuits to determine cost-driven enabling points of 2.5D and 3D integration under consideration of design size and power density. Our results suggest that proper use of die-integration technologies can realize substantial cost savings over traditional 2D design, even with the inclusion of packaging and cooling costs. When thermal properties are considered, interposer-based 2.5D integration is predicted to be more cost effective than TSV-based 3D integration, especially when power density is high.","PeriodicalId":140647,"journal":{"name":"2016 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"85 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123586087","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}