Proceedings of the 2018 on Great Lakes Symposium on VLSI: Latest Papers (Page 3)

A Novel Polymorphic Gate Based Circuit Fingerprinting Technique

Proceedings of the 2018 on Great Lakes Symposium on VLSI Pub Date : 2018-05-30 DOI: 10.1145/3194554.3194572

Tian Wang, Xiaohui Cui, Dunshan Yu, Omid Aramoon, Timothy Dunlap, G. Qu, Xiaole Cui

Abstract: Polymorphic gates are reconfigurable devices that deliver multiple functionalities at different temperatures, supply voltages, or external inputs. Capable of working in different modes, the polymorphic gate is a promising candidate for embedding secret information such as fingerprints. In this paper we report five polymorphic gates whose functionality varies in response to a specific control input and propose a circuit fingerprinting scheme based on these gates. The scheme selectively replaces standard logic cells with polymorphic gates whose functionality differs from that of the standard cells only on Satisfiability Don't Care conditions. Additional dummy fingerprint bits are also introduced to enhance the fingerprint's robustness against attacks such as fingerprint removal and modification. Experimental results on ISCAS and MCNC benchmark circuits demonstrate that our scheme introduces low overhead. More specifically, the average overheads in area, speed and power are 4.04%, 6.97% and 4.15%, respectively, when we embed a 64-bit fingerprint that consists of 32 real fingerprint bits and 32 dummy bits. This is only half of the overhead of the other known approach when it creates 32-bit fingerprints.
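To make the Satisfiability Don't Care (SDC) condition in the abstract above concrete, here is a minimal sketch in Python. It is a toy illustration, not the authors' gates or benchmark flow: the two-input sub-circuit (`a = x AND y`, `b = x OR y`), the AND standard cell, and the alternate mode that simply passes `a` are all assumptions chosen for the example. The check confirms that the two functions disagree only on an input pattern that the upstream logic can never produce, which is the property that would let a replacement carry a fingerprint bit without changing circuit behavior.

```python
from itertools import product


def reachable_inputs():
    """Enumerate the (a, b) pairs the downstream gate can actually see.

    Hypothetical upstream logic: a = x AND y, b = x OR y.
    """
    seen = set()
    for x, y in product((0, 1), repeat=2):
        a = x & y
        b = x | y
        seen.add((a, b))
    return seen


def standard_cell(a, b):
    """The original cell in this toy example: a 2-input AND."""
    return a & b


def alternate_mode(a, b):
    """Hypothetical second mode: passes a, so it differs from AND only at (1, 0)."""
    return a


def differs_only_on_sdc():
    """Return (ok, diff): ok is True when every disagreeing pattern is unreachable."""
    reach = reachable_inputs()
    diff = {
        (a, b)
        for a, b in product((0, 1), repeat=2)
        if standard_cell(a, b) != alternate_mode(a, b)
    }
    return diff.isdisjoint(reach), diff


if __name__ == "__main__":
    ok, diff = differs_only_on_sdc()
    print("disagreeing patterns:", sorted(diff))
    print("safe to swap in this toy circuit:", ok)
```

In this toy circuit the pattern `(a, b) = (1, 0)` cannot occur, because `x AND y = 1` forces `x OR y = 1`; the two modes are therefore indistinguishable at the circuit outputs.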

Citations: 3

BiNMAC

Proceedings of the 2018 on Great Lakes Symposium on VLSI Pub Date : 2018-05-30 DOI: 10.1145/3194554.3194634

A. Jafari, M. Hosseini, Adwaya Kulkarni, C. Patel, T. Mohsenin

Abstract: This paper presents a low-power, domain-specific manycore accelerator referred to as "BiNMAC" (Binarized neural Network Manycore ACcelerator), which effectively maps and executes Binary Deep Neural Networks (BNNs). With only 2.40% and 1.88% area and power overhead, novel instructions such as Population-Count and Patch-Select are added to the ISA of the BiNMAC, each of which replaces frequently used functions that would have taken 52 and 4 clock cycles, respectively, with 1 clock cycle. A 64-cluster architecture of the BiNMAC is fully placed and routed in 65 nm TSMC CMOS technology, where a single cluster occupies an area of 0.53 mm^2 with a power of 223 mW at 1 GHz clock frequency. The 64-cluster architecture takes 36.5 mm^2 of area and, if fully utilized, consumes 16.4 W of power. We also propose a multilayer perceptron (MLP) neural network for multimodal time-series data classification. Binarized versions of the 3-layer MLP and ResNet-20 are implemented on BiNMAC. The implementation results show that BiNMAC consumes 0.02 mJ and 3.8 mJ of energy, which is 13 times and 30 times lower, respectively, than the implementation of the standard non-binarized MLP and ResNet-20 on an equivalent predecessor platform. To compare the performance of the BiNMAC with other off-the-shelf platforms, the two networks are also implemented on the NVIDIA Jetson TX2 SoC (CPU+GPU). BiNMAC achieves 22 times and 78 times higher throughput and 23 times and 41 times lower energy consumption compared to the TX2 SoC for the binarized MLP and ResNet-20, respectively.
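The abstract above does not spell out the kernel that the Population-Count instruction accelerates, but binarized networks commonly compute dot products of +1/-1 vectors as an XNOR followed by a population count. The following Python sketch shows that general technique under stated assumptions: the function names `pack_bits` and `binary_dot` are hypothetical, the bit encoding (1 for +1, 0 for -1) is a convention chosen here, and nothing in it models the BiNMAC ISA or its cycle counts.

```python
def pack_bits(values):
    """Pack a list of +1/-1 values into an int; bit i is set when values[i] == +1."""
    word = 0
    for i, v in enumerate(values):
        if v == 1:
            word |= 1 << i
    return word


def binary_dot(a_bits, b_bits, n):
    """Dot product of two length-n +1/-1 vectors given in packed form.

    XNOR marks the positions where the signs agree; each agreement
    contributes +1 and each disagreement -1, so the result is
    2 * popcount(xnor) - n.
    """
    mask = (1 << n) - 1                   # keep only the n valid bit positions
    agree = ~(a_bits ^ b_bits) & mask     # XNOR restricted to n bits
    matches = bin(agree).count("1")       # software population count
    return 2 * matches - n


if __name__ == "__main__":
    a = [1, -1, 1, 1]
    b = [1, 1, -1, 1]
    packed = binary_dot(pack_bits(a), pack_bits(b), len(a))
    reference = sum(x * y for x, y in zip(a, b))
    print(packed, reference)
```

The population count is the step that takes many instructions on a core without hardware support, which is why a single-cycle instruction for it can pay off in this kind of workload.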

Citations: 13

SARO: A State-Aware Reliability Optimization Technique for High Density NAND Flash Memory

Proceedings of the 2018 on Great Lakes Symposium on VLSI Pub Date : 2018-05-30 DOI: 10.1145/3194554.3194591

Myungsuk Kim, Youngsun Song, Myoungsoo Jung, Jihong Kim

Abstract: Recent advances in flash technologies, such as scaling and multi-leveling schemes, have succeeded in making flash denser and securing more storage space per die. Unfortunately, these technology advances significantly degrade flash reliability due to a smaller cell geometry and finer-grained cell state control. In this paper, we propose a state-aware reliability optimization technique (SARO), a new flash optimization that improves flash reliability under diverse scaling and multi-leveling schemes. To this end, we first reveal that reliability-related flash errors are highly skewed among flash cell states, which was not captured by prior studies. The proposed SARO then exploits the different per-state error behavior in flash cell states by selecting the most error-prone flash states (for each error type) and by forming narrow threshold voltage distributions (for the selected states only). Furthermore, SARO is applied only when the program time gets shorter because of flash cell aging, thereby keeping the program latency unchanged. Our experimental results with real MLC and TLC flash devices show that SARO can reduce a significant number of flash bit errors, which can in turn reduce the read latency by 40% on average.

Citations: 4

Leveraging RF Power for Intelligent Tag Networks

Proceedings of the 2018 on Great Lakes Symposium on VLSI Pub Date : 2018-05-30 DOI: 10.1145/3194554.3194621

E. Salman, M. Stanaćević, Samir R Das, P. Djurić

Citations: 7

Evaluation of the Complexity of Automated Trace Alignment using Novel Power Obfuscation Methods

Proceedings of the 2018 on Great Lakes Symposium on VLSI Pub Date : 2018-05-30 DOI: 10.1145/3194554.3194640

Bozhi Liu, Kemeng Chen, Minjun Seo, Janet Roveda, Roman L. Lysecky

Citations: 1

Performance and Energy Enhancement through an Online Single/Multi Level Mode Switching Cache Architecture

Proceedings of the 2018 on Great Lakes Symposium on VLSI Pub Date : 2018-05-30 DOI: 10.1145/3194554.3194599

Ramin Rezaeizadeh Rookerd, Somayeh Sadeghi Kohan, Z. Navabi

Abstract: STT-RAM cells can be considered as an alternative or a hybrid addition to today's SRAM-based cache memories, mostly because of their scalability and low leakage power. Moreover, their data storage mechanism (storing the value as resistance) makes them well suited to multivalue cache architectures. This feature enhances system performance without any area overhead. On the other hand, the two-step read/write procedure required in multilevel cells results in non-uniform access time and in energy and power overhead on the system. In this paper, we propose a new architecture that dynamically swaps data between soft (fast read access) and hard (slow read access) bits in an ML cell. Moreover, by reconfiguring the cache block size, the proposed architecture can switch between ML and SL modes at runtime. In other words, the swapping method places the hot part of each cache block into the soft bits and the less accessed part into the hard bits. The SL/ML switching method benefits from the low latency and energy of SL mode and the high storage capacity of ML mode at the same time. Although experimental results show that our proposed method slightly increases the miss rate compared with conventional ML caches, performance and energy are improved by 4.9% and 6.5%, respectively. Also, the storage overhead of our method is about 1%, which is negligible.

Citations: 0

Session details: Special Session 3: Circuits and Systems for Autonomous IoT Devices

Proceedings of the 2018 on Great Lakes Symposium on VLSI Pub Date : 2018-05-30 DOI: 10.1145/3252916

E. Salman, M. Stanaćević

Citations: 0

Session details: Special Session 6: Stochastic and Approximate Computing for Emerging Learning and Communication Systems

Proceedings of the 2018 on Great Lakes Symposium on VLSI Pub Date : 2018-05-30 DOI: 10.1145/3252919

Jie Han, Yue Zhang

Citations: 0

A Distributed Parallel Random Walk Algorithm for Large-Scale Capacitance Extraction and Simulation

Proceedings of the 2018 on Great Lakes Symposium on VLSI Pub Date : 2018-05-30 DOI: 10.1145/3194554.3194568

Mingye Song, Zhezhao Xu, Wei Xue, Wenjian Yu

Abstract: Owing to its advantages in scalability and reliability, the floating random walk (FRW) algorithm has been widely adopted for calculating the capacitances among three-dimensional (3-D) conductors. This is evidenced by the industrial practice of interconnect capacitance extraction during the design of high-performance very large-scale integrated (VLSI) circuits. In this work, the FRW algorithm is enhanced through distributed parallel computing. With an efficient and adaptive task allocation scheme, the communication among different computer nodes is largely reduced. A distributed algorithm for accelerating the space management is also proposed. Both have been implemented with the Message Passing Interface (MPI) and applied to high-precision capacitance simulation for touchscreen design and to interconnect capacitance extraction of VLSI circuits. Experiments on a computer cluster show that the proposed techniques achieve up to 114X speedup while using 120 cores, and build the space management structure for a VLSI case including two million conductor blocks in just 22 seconds (37X parallel speedup on 60 cores).
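The two speedup figures quoted above can be put on a common scale by converting them to parallel efficiency, defined in the usual way as speedup divided by core count. The short Python sketch below does only that arithmetic; the helper name `parallel_efficiency` is introduced here for illustration, and the efficiency values are derived from the abstract's numbers rather than reported by the authors.

```python
def parallel_efficiency(speedup, cores):
    """Fraction of ideal linear scaling achieved: speedup / cores."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return speedup / cores


if __name__ == "__main__":
    # Figures taken from the abstract: (label, speedup, cores).
    cases = [
        ("FRW extraction", 114, 120),
        ("space management build", 37, 60),
    ]
    for label, speedup, cores in cases:
        eff = parallel_efficiency(speedup, cores)
        print(f"{label}: {eff:.1%} efficiency on {cores} cores")
```

Read this way, the random walk phase scales close to linearly, while the space management construction scales noticeably less well, which is the common pattern when a Monte Carlo phase is paired with a shared data-structure build.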

Citations: 6

A Framework Exploiting Process Variability to Improve Energy Efficiency in FPGA Applications

Proceedings of the 2018 on Great Lakes Symposium on VLSI Pub Date : 2018-05-30 DOI: 10.1145/3194554.3194569

Konstantinos Maragos, G. Lentaris, I. Stratakos, D. Soudris

Citations: 3