2020 IEEE Computer Society Annual Symposium on VLSI (ISVLSI): Latest Publications

Efficient Hardware Implementation of Artificial Neural Networks Using Approximate Multiply-Accumulate Blocks
2020 IEEE Computer Society Annual Symposium on VLSI (ISVLSI) Pub Date : 2020-07-01 DOI: 10.1109/isvlsi49217.2020.00027
Mohammadreza Esmali Nojehdeh, L. Aksoy, M. Altun
{"title":"Efficient Hardware Implementation of Artificial Neural Networks Using Approximate Multiply-Accumulate Blocks","authors":"Mohammadreza Esmali Nojehdeh, L. Aksoy, M. Altun","doi":"10.1109/isvlsi49217.2020.00027","DOIUrl":"https://doi.org/10.1109/isvlsi49217.2020.00027","url":null,"abstract":"In this paper, we explore efficient hardware implementation of feedforward artificial neural networks (ANNs) using approximate adders and multipliers. We also introduce an approximate multiplier with a simple structure leading to a considerable reduction in the ANN hardware complexity. Due to a large area requirement in a parallel architecture, the ANNs are implemented under the time-multiplexed architecture where computing resources are re-used in the multiply-accumulate (MAC) blocks. The efficient hardware implementation of ANNs is realized by replacing the exact adders and multipliers in the MAC blocks by the approximate ones taking into account the hardware accuracy. Experimental results show that the ANNs designed using the proposed approximate multiplier have smaller area and consume less energy than those designed using previously proposed prominent approximate multipliers. 
It is also observed that the use of both approximate adders and multipliers yields respectively up to a 64% and 43% reduction in energy consumption and area of the ANN design with a slight decrease in the hardware accuracy when compared to the exact adders and multipliers.","PeriodicalId":423851,"journal":{"name":"2020 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128257093","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
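The abstract above describes time-multiplexed multiply-accumulate (MAC) blocks in which exact multipliers are swapped for approximate ones. A minimal sketch of that idea follows. It assumes a simple operand-truncation multiplier, which is a stand-in chosen for illustration and not the multiplier proposed in the paper; the names `approx_mul`, `mac`, and the parameter `k` are hypothetical.

```python
def approx_mul(a, b, k=2):
    """Approximate unsigned multiply (illustrative assumption, not the paper's design).

    The k least-significant bits of each operand are zeroed before the
    multiplication, which in hardware would shrink the partial-product array.
    """
    mask = ~((1 << k) - 1)
    return (a & mask) * (b & mask)


def mac(inputs, weights, mul=lambda a, b: a * b):
    """Time-multiplexed MAC: one multiplier and one accumulator reused per step."""
    acc = 0
    for x, w in zip(inputs, weights):
        acc += mul(x, w)  # one product is accumulated per "clock cycle"
    return acc


if __name__ == "__main__":
    xs, ws = [3, 5, 7], [2, 4, 6]
    exact = mac(xs, ws)
    approx = mac(xs, ws, mul=lambda a, b: approx_mul(a, b, k=1))
    print(exact, approx)
```

Passing the multiplier as an argument keeps the accumulation loop unchanged, so the exact and approximate variants can be compared on the same inputs.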
High Level Modeling of Memristive Crossbar Arrays
2020 IEEE Computer Society Annual Symposium on VLSI (ISVLSI) Pub Date : 2020-07-01 DOI: 10.1109/isvlsi49217.2020.000-3
Md. Adnan Zaman, Rajeev Joshi, S. Katkoori
{"title":"High Level Modeling of Memristive Crossbar Arrays","authors":"Md. Adnan Zaman, Rajeev Joshi, S. Katkoori","doi":"10.1109/isvlsi49217.2020.000-3","DOIUrl":"https://doi.org/10.1109/isvlsi49217.2020.000-3","url":null,"abstract":"Crossbar architecture is one of the prominent candidates to enable memristor-based in-memory computing. Recent literature suggests that predominantly SPICE-level simulations have been performed to check the correctness of memristive systems. Though SPICE simulation gives accurate results, it takes a substantial amount of time as circuit complexity increases. Currently, memristor mapping tools (such as SIMPLER MAGIC) are not guaranteed to generate a correct design by construction, as they do not provide any formal proof for their corresponding tools. These reasons motivate us to develop a behavioral model of the memristive system. We use two processes to model the memristor: one decides the final signal value when multiple sources drive it, and the other decides the final states of the memristors. The proposed model, along with the control voltage sequence and the initial states of the memristors, allows us to quickly verify the functionality of the memristive system using VHDL-based simulation. While several SPICE-level models are available, to the best of our knowledge, this is the first work that proposes a behavioral VHDL model of a memristor. 
To validate our proposed approach, we compare our model with a SPICE-based model in terms of functional correctness and runtime speedup. Experimental evaluation on thirteen (13) different combinational benchmark circuits resulted in runtime speedups of 140X on average, with a range of 8X to 205X.","PeriodicalId":423851,"journal":{"name":"2020 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"227 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131346433","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Fast Linear Programming Optimization Using Crossbar-Based Analog Accelerator
2020 IEEE Computer Society Annual Symposium on VLSI (ISVLSI) Pub Date : 2020-07-01 DOI: 10.1109/isvlsi49217.2020.00057
Liuting Shang, Muhammad Adil, Ramtin Madani, C. Pan
{"title":"Fast Linear Programming Optimization Using Crossbar-Based Analog Accelerator","authors":"Liuting Shang, Muhammad Adil, Ramtin Madani, C. Pan","doi":"10.1109/isvlsi49217.2020.00057","DOIUrl":"https://doi.org/10.1109/isvlsi49217.2020.00057","url":null,"abstract":"Linear programming optimization is critical to logistics management, engineering designs, and decision making in every area of the economy. Traditional hardware that uses GPU and CPU platforms for this purpose is significantly limited by the scaling of transistor size. In this paper, an analog in-memory computation circuit is proposed to accelerate linear programming optimization problems. The proposed scheme includes a memristor crossbar array and analog peripheral circuits that do not need ADC/DAC conversion between iterations of the algorithm. In addition, we discuss several key parameters related to interconnect parasitics and non-ideal device characteristics to provide practical guidelines. Furthermore, we propose three design schemes to mitigate the computation error that comes from the interconnect resistance in a large-scale crossbar array implementation. Optimal design parameters are quantitatively analyzed under a given number of memristance and array size. 
It is demonstrated that the proposed accelerator achieves energy consumption, area, and delay reductions of ~21×, ~151×, and ~33×, respectively, compared to 16nm-technology CMOS digital circuits for a 1000×1000 array with a precision of 6 bits.","PeriodicalId":423851,"journal":{"name":"2020 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"110 5","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"113961771","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
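The accelerator in the abstract above is built around a memristor crossbar array. As background, the sketch below shows the textbook idealization of a crossbar as an analog matrix-vector multiplier: each column current is the sum of row voltages times device conductances (Ohm's law plus Kirchhoff's current law). It deliberately ignores the interconnect resistance and device non-idealities that the abstract identifies as error sources, and the function name `crossbar_mvm` is hypothetical rather than taken from the paper.

```python
def crossbar_mvm(conductances, voltages):
    """Return the per-column output currents of an ideal crossbar.

    conductances[i][j] is the conductance (siemens) of the device joining
    row i to column j; voltages[i] is the voltage applied to row i.
    Wire resistance and sneak paths are ignored (idealized assumption).
    """
    rows = len(conductances)
    if rows != len(voltages):
        raise ValueError("one voltage is required per crossbar row")
    cols = len(conductances[0]) if rows else 0
    currents = [0.0] * cols
    for i in range(rows):
        for j in range(cols):
            # Ohm's law per device; Kirchhoff's current law sums along the column.
            currents[j] += voltages[i] * conductances[i][j]
    return currents


if __name__ == "__main__":
    g = [[1.0, 2.0], [3.0, 4.0]]
    print(crossbar_mvm(g, [2.0, 0.5]))
```

In this idealized view the whole product is formed in one analog step, which is why iterative solvers map naturally onto such arrays.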
A 2^7-1 Low-Power Half-Rate 16-Gb/s Charge-Mode PRBS Generator in 1.2V, 65nm CMOS
2020 IEEE Computer Society Annual Symposium on VLSI (ISVLSI) Pub Date : 2020-07-01 DOI: 10.1109/isvlsi49217.2020.00046
Prema Kumar Govindaswamy, V. Pasupureddi
{"title":"A 2^7 -1 Low-Power Half-Rate 16-Gb/s Charge-Mode PRBS Generator in 1.2V, 65nm CMOS","authors":"Prema Kumar Govindaswamy, V. Pasupureddi","doi":"10.1109/isvlsi49217.2020.00046","DOIUrl":"https://doi.org/10.1109/isvlsi49217.2020.00046","url":null,"abstract":"In this work, we propose a half-rate 2^7-1 pseudo-random bit sequence (PRBS) generator that employs a highly power-efficient charge-mode circuit topology at 16 Gb/s. At the target data rate, the proposed charge-mode implementation has the lowest power consumption compared to traditional current-mode PRBS generator implementations, thanks to the availability of high-speed switches in sub-100nm technologies. The proposed charge-mode half-rate PRBS generator is implemented in 1.2 V, 65-nm CMOS technology with a power consumption of 3.35 mW, timing jitter of 0.2 ps, and FoM of 0.02 pJ/bit at 16 Gb/s. Thus, the proposed power-efficient charge-mode implementation of the PRBS generator is an attractive candidate for on-chip bit-error-rate (BER) test and measurement applications.","PeriodicalId":423851,"journal":{"name":"2020 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"198 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133556367","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
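For readers unfamiliar with the term, a 2^7-1 PRBS is the 127-bit maximal-length sequence produced by a 7-bit linear-feedback shift register (LFSR). The sketch below is a software model only. It assumes the commonly used PRBS7 feedback from stages 7 and 6, which the abstract does not state, and the names `prbs7`, `split_half_rate`, and `interleave` are hypothetical. The split and interleave helpers illustrate the general half-rate idea of two streams at half the bit rate being multiplexed into the full-rate output; they say nothing about the charge-mode circuit itself.

```python
def prbs7(seed=0x7F, nbits=127):
    """Generate nbits from a 7-bit Fibonacci LFSR (taps at stages 7 and 6, assumed)."""
    if not 0 < seed < 0x80:
        raise ValueError("seed must be a non-zero 7-bit value")
    state = seed
    bits = []
    for _ in range(nbits):
        feedback = ((state >> 6) ^ (state >> 5)) & 1  # XOR of the two oldest stages
        bits.append(feedback)
        state = ((state << 1) | feedback) & 0x7F      # shift in the new bit
    return bits


def split_half_rate(bits):
    """Split a full-rate stream into even and odd half-rate streams."""
    return bits[0::2], bits[1::2]


def interleave(even, odd):
    """2:1 multiplex two half-rate streams back into the full-rate stream."""
    out = []
    for e, o in zip(even, odd):
        out.extend((e, o))
    if len(even) > len(odd):
        out.append(even[-1])
    return out


if __name__ == "__main__":
    seq = prbs7(nbits=254)
    print(seq[:127] == seq[127:])  # the sequence repeats every 127 bits
```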
Inference and Energy Efficient Design of Deep Neural Networks for Embedded Devices
2020 IEEE Computer Society Annual Symposium on VLSI (ISVLSI) Pub Date : 2020-07-01 DOI: 10.1109/isvlsi49217.2020.00017
Ioannis Galanis, Iraklis Anagnostopoulos, Chinh Nguyen, Guillermo Bares, Dona Burkard
{"title":"Inference and Energy Efficient Design of Deep Neural Networks for Embedded Devices","authors":"Ioannis Galanis, Iraklis Anagnostopoulos, Chinh Nguyen, Guillermo Bares, Dona Burkard","doi":"10.1109/isvlsi49217.2020.00017","DOIUrl":"https://doi.org/10.1109/isvlsi49217.2020.00017","url":null,"abstract":"Deep/Convolutional Neural Networks (DNNs/CNNs) are deployed on resource-constrained embedded devices in order to serve popular computer vision applications. However, DNNs have increased computing requirements, and battery-operated devices struggle to deliver acceptable performance. In this paper, we present an efficient design of DNNs for edge devices that performs a DNN architectural search. Our method finds alternative designs of DNNs that have lower energy consumption and inference time than ResNet reference networks. Experimental results show up to 78.82% reduction in energy consumption and 35.71% in inference time, while training up to 95.67% fewer networks. As a trade-off, our approach compromises the user Quality of Service by up to 2% compared to the reference networks.","PeriodicalId":423851,"journal":{"name":"2020 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130994167","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A Multi-grained Reconfigurable Accelerator for Approximate Computing
2020 IEEE Computer Society Annual Symposium on VLSI (ISVLSI) Pub Date : 2020-07-01 DOI: 10.1109/isvlsi49217.2020.00026
Yirong Kan, Man Wu, Renyuan Zhang, Y. Nakashima
{"title":"A Multi-grained Reconfigurable Accelerator for Approximate Computing","authors":"Yirong Kan, Man Wu, Renyuan Zhang, Y. Nakashima","doi":"10.1109/isvlsi49217.2020.00026","DOIUrl":"https://doi.org/10.1109/isvlsi49217.2020.00026","url":null,"abstract":"An elastic neural network is implemented on an FPGA to construct a multi-grained reconfigurable accelerator (MGRA). On the basis of a novel bisection neural network (BNN) topology, the entire network on hardware is efficiently partitioned into arbitrary pieces with a diamond-like shape (referred to as \"DiaNet\"), which perform regressions for retrieving arbitrary approximate calculations in parallel. By organizing massive DiaNets, the entire network is reconfigurable at the fine-grained (functions of each DiaNet), mid-grained (DiaNet features), and coarse-grained (organization of DiaNets) levels without redundancy. In this work, a proof-of-concept BNN with 8x8 processing elements (PEs) is implemented on an FPGA to run six calculation units (CUs) in parallel. Over various approximate computing tasks with one, two, and three operands, all calculations are retrieved with an inaccuracy of less than 3.1%. 
The maximum hardware utilization of a single CU is reduced to 1.7%, 17.9%, and 7.6% of that of a general arithmetic logic unit (ALU) and of approximate computing units powered by a domain-specific architecture (DSA) and by a neural network, respectively.","PeriodicalId":423851,"journal":{"name":"2020 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131353326","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Regulating Degree of Adaptiveness for Performance-Centric NoC Routing
2020 IEEE Computer Society Annual Symposium on VLSI (ISVLSI) Pub Date : 2020-07-01 DOI: 10.1109/isvlsi49217.2020.00007
T. S. Das, Navonil Chatterjee, P. Ghosal
{"title":"Regulating Degree of Adaptiveness for Performance-Centric NoC Routing","authors":"T. S. Das, Navonil Chatterjee, P. Ghosal","doi":"10.1109/isvlsi49217.2020.00007","DOIUrl":"https://doi.org/10.1109/isvlsi49217.2020.00007","url":null,"abstract":"In the network-on-chip (NoC) communication framework, congestion in priority-fixed shortest routes may result in poor network performance in terms of increased packet latency and reduced throughput. Here, adaptive routing allows more freedom in selecting an alternate congestion-free route in a minimal or non-minimal direction. However, selecting an output link in a non-minimal direction based on local congestion information may degrade network performance rather than improve it, due to the increased number of resource sharers along a longer route. Moreover, packet routing over a longer route may not support guaranteed throughput (GT) intensive real-time applications. In addition, allowing freedom in the non-minimal route increases the chance of deadlock and livelock cycles. In this work, we follow an adaptive routing approach that relies on reserving a virtual path for routing packets in both minimal and non-minimal directions while satisfying the application demands of meeting the hard deadline on packet arrival time and the guaranteed minimum throughput. We also investigate the trade-off between the routing flexibility granted and overall network performance in the presence of various data traffic patterns. 
Our experimental results reveal that fixing the deflection range in the non-minimal direction at run time is more beneficial than always selecting a specific value, as this range varies with the underlying application demands and the present network traffic situation.","PeriodicalId":423851,"journal":{"name":"2020 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114715698","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
3D-Sorter: 3D Design of a Resource-Aware Hardware Sorter for Edge Computing Platforms Under Area and Energy Consumption Constraints
2020 IEEE Computer Society Annual Symposium on VLSI (ISVLSI) Pub Date : 2020-07-01 DOI: 10.1109/isvlsi49217.2020.00018
Amin Norollah, Z. Kazemi, D. Hély
{"title":"3D-Sorter: 3D Design of a Resource-Aware Hardware Sorter for Edge Computing Platforms Under Area and Energy Consumption Constraints","authors":"Amin Norollah, Z. Kazemi, D. Hély","doi":"10.1109/isvlsi49217.2020.00018","DOIUrl":"https://doi.org/10.1109/isvlsi49217.2020.00018","url":null,"abstract":"In this paper, we propose a 3-dimensional hardware sorting architecture (3D-Sorter) based on the MultiDimensional Sorting Algorithm (MDSA). The proposed architecture transforms a sequence of input records into a 3-dimensional matrix. Records of every dimension are sorted in several MDSA phases, using partial sorting methods. Our synthesis results, provided by Xilinx Vivado, indicate that the 3D-Sorter design decreases the number of Look-Up Tables (LUTs) and registers by 54% and 42.7%, respectively, compared to the state-of-the-art hardware sorter. Also, the power consumption is reduced by 48.15% on average. The results show that the proposed architecture offers remarkable power and area savings for edge components.","PeriodicalId":423851,"journal":{"name":"2020 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117194288","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Tunable Voltage-Mode Subthreshold CMOS Neuron
2020 IEEE Computer Society Annual Symposium on VLSI (ISVLSI) Pub Date : 2020-07-01 DOI: 10.1109/ISVLSI49217.2020.00053
Margherita Ronchini, M. Zamani, H. Farkhani, F. Moradi
{"title":"Tunable Voltage-Mode Subthreshold CMOS Neuron","authors":"Margherita Ronchini, M. Zamani, H. Farkhani, F. Moradi","doi":"10.1109/ISVLSI49217.2020.00053","DOIUrl":"https://doi.org/10.1109/ISVLSI49217.2020.00053","url":null,"abstract":"To address the ever-increasing computational demands of machine learning applications, neuromorphic computing has emerged as a possible solution. The goal is to design a platform able to mimic the processing strategies of the brain. A neuromorphic system is composed of artificial neurons and synapses implemented in hardware with a high level of integration. Such implementations entail challenges including power efficiency, compactness, and biophysical resemblance. This work proposes a new implementation of a neuron circuit initially introduced by Wijekoon and Dudek. We show that the proposed neuron, designed in a standard 0.18µm CMOS process, consumes 58.5fJ/spike at 0.2V supply voltage. The area covered by the circuit is 16.8% of the area of the state-of-the-art implementation. This result was achieved by lowering the membrane capacitance and the number of transistors. In addition, spiking activity unfolds on a biological time scale rather than an accelerated one. The circuit preserves the possibility of being adjusted by external biases to attain different firing patterns.","PeriodicalId":423851,"journal":{"name":"2020 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132136563","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
Leveraging 3D Vertical RRAM to Developing Neuromorphic Architecture for Pattern Classification
2020 IEEE Computer Society Annual Symposium on VLSI (ISVLSI) Pub Date : 2020-07-01 DOI: 10.1109/ISVLSI49217.2020.00054
Bokyung Kim, H. Li
{"title":"Leveraging 3D Vertical RRAM to Developing Neuromorphic Architecture for Pattern Classification","authors":"Bokyung Kim, H. Li","doi":"10.1109/ISVLSI49217.2020.00054","DOIUrl":"https://doi.org/10.1109/ISVLSI49217.2020.00054","url":null,"abstract":"The crossbar architecture with resistive random-access memory (RRAM) devices presents many advantages in realizing matrix-based computations and has achieved success in neural network implementations. However, the rapid growth of network size demands even denser structures. In this paper, we investigate the neuromorphic hardware design based on the three-dimensional vertical RRAM (3D VRRAM) with an even/odd word line (WL) structure. The increased interconnects of VRRAM aggravate the chronic problems of the crossbar structure, such as sneak path currents. We address this issue by attaining a balanced structure with highly nonlinear RRAM devices. Furthermore, the impact of complicated signal routing and control due to the vertically stacked structure can be alleviated through architectural-level optimization. A three-layer VRRAM structure is demonstrated for neuromorphic design by showing that 8x8-pixel images were successfully classified into three alphabet characters on this structure. The example design also verifies that the 3D VRRAM with even/odd WL structure is beneficial for achieving high area efficiency.","PeriodicalId":423851,"journal":{"name":"2020 IEEE Computer Society Annual Symposium on VLSI (ISVLSI)","volume":"481 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132568325","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3