2020 IEEE Computer Society Annual Symposium on VLSI (ISVLSI): Latest Papers, Page 10

Efficient Hardware Implementation of Artificial Neural Networks Using Approximate Multiply-Accumulate Blocks

2020 IEEE Computer Society Annual Symposium on VLSI (ISVLSI) Pub Date : 2020-07-01 DOI: 10.1109/isvlsi49217.2020.00027

Mohammadreza Esmali Nojehdeh, L. Aksoy, M. Altun

Abstract: In this paper, we explore efficient hardware implementation of feedforward artificial neural networks (ANNs) using approximate adders and multipliers. We also introduce an approximate multiplier with a simple structure that considerably reduces ANN hardware complexity. Because a parallel architecture requires a large area, the ANNs are implemented under a time-multiplexed architecture in which computing resources are reused in the multiply-accumulate (MAC) blocks. The efficient hardware implementation is realized by replacing the exact adders and multipliers in the MAC blocks with approximate ones while taking the hardware accuracy into account. Experimental results show that ANNs designed using the proposed approximate multiplier have smaller area and consume less energy than those designed using prominent previously proposed approximate multipliers. Using both approximate adders and multipliers yields up to a 64% reduction in energy consumption and up to a 43% reduction in area of the ANN design, with a slight decrease in hardware accuracy, compared with exact adders and multipliers.

Citations: 2
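
The time-multiplexed MAC idea in the abstract above can be illustrated with a minimal Python sketch. It assumes a truncation-based approximate multiplier as a stand-in; this is not the multiplier proposed in the paper, and the names `approx_mul`, `mac_neuron`, and `drop_bits` are hypothetical.

```python
def approx_mul(a: int, b: int, drop_bits: int = 2) -> int:
    """Illustrative approximate multiplier (an assumption, not the paper's design):
    zero the low `drop_bits` bits of each operand's magnitude, then multiply."""
    sign = -1 if (a < 0) != (b < 0) else 1
    mask = ~((1 << drop_bits) - 1)
    return sign * ((abs(a) & mask) * (abs(b) & mask))


def mac_neuron(inputs, weights, bias=0, mul=lambda a, b: a * b):
    """Time-multiplexed neuron: a single multiplier and a single accumulator
    are reused for every (input, weight) pair, one pair per step."""
    acc = bias
    for x, w in zip(inputs, weights):
        acc += mul(x, w)  # one multiply-accumulate per step
    return acc


inputs = [120, -77, 33, 50]
weights = [90, 64, -60, 21]
exact = mac_neuron(inputs, weights)
approx = mac_neuron(inputs, weights, mul=approx_mul)
print("exact:", exact, "approximate:", approx)
```

Passing the multiplier as a parameter mirrors the substitution the abstract describes: the MAC loop is unchanged, and only the arithmetic unit inside it is swapped.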

High Level Modeling of Memristive Crossbar Arrays

2020 IEEE Computer Society Annual Symposium on VLSI (ISVLSI) Pub Date : 2020-07-01 DOI: 10.1109/isvlsi49217.2020.000-3

Md. Adnan Zaman, Rajeev Joshi, S. Katkoori

Abstract: Crossbar architecture is one of the prominent candidates for enabling memristor-based in-memory computing. Recent literature suggests that the correctness of memristive systems has predominantly been checked with SPICE-level simulations. Although SPICE simulation gives accurate results, it takes a substantial amount of time as circuit complexity increases. Moreover, current memristor mapping tools (such as SIMPLER MAGIC) are not guaranteed to generate designs that are correct by construction, as no formal proof is provided for them. These reasons motivate a behavioral model of the memristive system. We use two processes to model the memristor: one decides the final signal value when multiple sources drive it, and the other decides the final states of the memristors. The proposed model, together with the control voltage sequence and the initial states of the memristors, allows us to quickly verify the functionality of the memristive system using VHDL-based simulation. While several SPICE-level models are available, to the best of our knowledge this is the first work to propose a behavioral VHDL model of a memristor. To validate the approach, we compare our model with a SPICE-based model in terms of functional correctness and runtime. Experimental evaluation on thirteen combinational benchmark circuits resulted in runtime speedups of 140X on average, ranging from 8X to 205X.

Citations: 1

Fast Linear Programming Optimization Using Crossbar-Based Analog Accelerator

2020 IEEE Computer Society Annual Symposium on VLSI (ISVLSI) Pub Date : 2020-07-01 DOI: 10.1109/isvlsi49217.2020.00057

Liuting Shang, Muhammad Adil, Ramtin Madani, C. Pan

Abstract: Linear programming optimization is critical to logistics management, engineering design, and decision making in every area of the economy. Traditional hardware that uses GPU and CPU platforms for this purpose is significantly limited by the scaling of transistor size. In this paper, an analog in-memory computation circuit is proposed to accelerate linear programming optimization. The proposed scheme includes a memristor crossbar array and analog peripheral circuits that do not need ADC/DAC conversion between iterations of the algorithm. In addition, we discuss several key parameters related to interconnect parasitics and non-ideal device characteristics to provide practical guidelines. Furthermore, we propose three design schemes to mitigate the computation error that comes from the interconnect resistance in a large-scale crossbar array implementation. Optimal design parameters are quantitatively analyzed for a given number of memristance levels and array size. It is demonstrated that the proposed accelerator achieves energy consumption, area, and delay reductions of ~21×, ~151×, and ~33×, respectively, compared with 16nm-technology CMOS digital circuits for a 1000×1000 array with 6-bit precision.

Citations: 0
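
The core operation a memristor crossbar contributes to an accelerator like the one in the abstract above is an analog matrix-vector product: with row voltages V_i and cell conductances G[i][j], each column current is I_j = sum_i V_i * G[i][j] by Ohm's and Kirchhoff's laws. The sketch below models only this ideal case. It ignores the interconnect resistance and device non-idealities that the paper analyzes, and the helper names `quantize` and `crossbar_mvm` are hypothetical.

```python
def quantize(value, levels, g_min, g_max):
    """Snap a conductance to the nearest of `levels` evenly spaced states
    in [g_min, g_max]; 64 levels would correspond to 6-bit precision."""
    step = (g_max - g_min) / (levels - 1)
    clamped = min(max(value, g_min), g_max)
    return g_min + round((clamped - g_min) / step) * step


def crossbar_mvm(voltages, conductances):
    """Ideal crossbar read-out: column current I_j = sum_i V_i * G[i][j].
    Wire resistance and device variation are deliberately not modeled."""
    n_cols = len(conductances[0])
    return [
        sum(v * row[j] for v, row in zip(voltages, conductances))
        for j in range(n_cols)
    ]


# Program a 2x2 array with 6-bit conductances, then read both columns at once.
target = [[0.20, 0.85], [0.40, 0.10]]
programmed = [[quantize(g, 64, 0.0, 1.0) for g in row] for row in target]
print(crossbar_mvm([1.0, 0.5], programmed))
```

All columns are evaluated in a single analog step in hardware; the list comprehension here only emulates that result sequentially.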

A 2^7-1 Low-Power Half-Rate 16-Gb/s Charge-Mode PRBS Generator in 1.2V, 65nm CMOS

2020 IEEE Computer Society Annual Symposium on VLSI (ISVLSI) Pub Date : 2020-07-01 DOI: 10.1109/isvlsi49217.2020.00046

Prema Kumar Govindaswamy, V. Pasupureddi

Citations: 1

Inference and Energy Efficient Design of Deep Neural Networks for Embedded Devices

2020 IEEE Computer Society Annual Symposium on VLSI (ISVLSI) Pub Date : 2020-07-01 DOI: 10.1109/isvlsi49217.2020.00017

Ioannis Galanis, Iraklis Anagnostopoulos, Chinh Nguyen, Guillermo Bares, Dona Burkard

Citations: 0

A Multi-grained Reconfigurable Accelerator for Approximate Computing

2020 IEEE Computer Society Annual Symposium on VLSI (ISVLSI) Pub Date : 2020-07-01 DOI: 10.1109/isvlsi49217.2020.00026

Yirong Kan, Man Wu, Renyuan Zhang, Y. Nakashima

Abstract: An elastic neural network is implemented on an FPGA to construct a multi-grained reconfigurable accelerator (MGRA). On the basis of a novel bisection neural network (BNN) topology, the entire network on hardware is efficiently partitioned into arbitrary diamond-shaped pieces (referred to as "DiaNets"), which perform regressions to retrieve arbitrary approximate calculations in parallel. By organizing massive numbers of DiaNets, the entire network is reconfigurable at fine-grained (functions of each DiaNet), mid-grained (DiaNet features), and coarse-grained (organization of DiaNets) levels without redundancy. In this work, a proof-of-concept BNN with 8x8 processing elements (PEs) is implemented on an FPGA to run six calculation units (CUs) in parallel. Over various approximate computing tasks with one, two, and three operands, all calculations are retrieved with an inaccuracy of less than 3.1%. The maximum hardware utilization of a single CU is reduced to 1.7%, 17.9%, and 7.6% of that of a general arithmetic logic unit (ALU), approximate computing units powered by a domain-specific architecture (DSA), and approximate computing units powered by a neural network, respectively.

Citations: 4

Regulating Degree of Adaptiveness for Performance-Centric NoC Routing

2020 IEEE Computer Society Annual Symposium on VLSI (ISVLSI) Pub Date : 2020-07-01 DOI: 10.1109/isvlsi49217.2020.00007

T. S. Das, Navonil Chatterjee, P. Ghosal

Abstract: In the network-on-chip (NoC) communication framework, congestion on priority-fixed shortest routes may degrade network performance through increased packet latency and reduced throughput. Adaptive routing allows more freedom in selecting an alternate congestion-free route in a minimal or non-minimal direction. However, selecting an output link in a non-minimal direction based on local congestion information may degrade rather than improve network performance, because a longer route has more resource sharers. Moreover, routing packets over a longer route may not support guaranteed-throughput (GT) intensive real-time applications. In addition, allowing freedom in non-minimal routes increases the chance of deadlock and livelock cycles. In this work, we follow an adaptive routing approach that relies on reserving a virtual path for routing packets in both minimal and non-minimal directions while satisfying application demands for hard deadlines on packet arrival time and a guaranteed minimum throughput. We also investigate the trade-off between the given routing flexibility and overall network performance under various data traffic patterns. Our experimental results reveal that fixing the deflection range in the non-minimal direction at run time is more beneficial than always selecting a specific value, as the deflection range varies with the underlying application demands and the current network traffic.

Citations: 2
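
To make the notion of a bounded deflection range in the abstract above concrete, here is a hedged Python sketch of hop selection on a 2D mesh. It assumes a per-packet budget of non-minimal hops and a simple local congestion table; the paper's virtual-path reservation, deadline checks, and deadlock avoidance are not modeled, and the names `candidate_hops`, `pick_hop`, and `deflections_left` are hypothetical.

```python
def manhattan(a, b):
    """Hop distance between two mesh coordinates."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def candidate_hops(cur, dst, deflections_left, width, height):
    """Return (next_node, is_minimal) pairs reachable from `cur`.
    Non-minimal neighbors are allowed only while the budget is positive."""
    hops = []
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nxt = (cur[0] + dx, cur[1] + dy)
        if not (0 <= nxt[0] < width and 0 <= nxt[1] < height):
            continue  # off the edge of the mesh
        minimal = manhattan(nxt, dst) < manhattan(cur, dst)
        if minimal or deflections_left > 0:
            hops.append((nxt, minimal))
    return hops


def pick_hop(cur, dst, deflections_left, congestion, width, height):
    """Choose the least congested allowed hop (ties prefer minimal hops)
    and return it with the remaining deflection budget."""
    if cur == dst:
        return cur, deflections_left
    hops = candidate_hops(cur, dst, deflections_left, width, height)
    nxt, minimal = min(hops, key=lambda h: (congestion.get(h[0], 0), not h[1]))
    return nxt, deflections_left - (0 if minimal else 1)


# A congested minimal link at (1, 0) pushes the packet onto a detour.
print(pick_hop((0, 0), (2, 0), 1, {(1, 0): 5}, 3, 3))
```

Setting the budget to zero reduces this to minimal adaptive routing, so the budget is the single knob that controls how much routing freedom a packet gets.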

3D-Sorter: 3D Design of a Resource-Aware Hardware Sorter for Edge Computing Platforms Under Area and Energy Consumption Constraints

2020 IEEE Computer Society Annual Symposium on VLSI (ISVLSI) Pub Date : 2020-07-01 DOI: 10.1109/isvlsi49217.2020.00018

Amin Norollah, Z. Kazemi, D. Hély

Citations: 1

Tunable Voltage-Mode Subthreshold CMOS Neuron

2020 IEEE Computer Society Annual Symposium on VLSI (ISVLSI) Pub Date : 2020-07-01 DOI: 10.1109/ISVLSI49217.2020.00053

Margherita Ronchini, M. Zamani, H. Farkhani, F. Moradi

Citations: 6

Leveraging 3D Vertical RRAM to Developing Neuromorphic Architecture for Pattern Classification

2020 IEEE Computer Society Annual Symposium on VLSI (ISVLSI) Pub Date : 2020-07-01 DOI: 10.1109/ISVLSI49217.2020.00054

Bokyung Kim, H. Li

Citations: 3