{"title":"A Memory-Efficient Hardware Architecture for Deformable Convolutional Networks","authors":"Yue Yu, Jiapeng Luo, W. Mao, Zhongfeng Wang","doi":"10.1109/SiPS52927.2021.00033","DOIUrl":"https://doi.org/10.1109/SiPS52927.2021.00033","url":null,"abstract":"In recent years, deformable convolutional networks are widely adopted in object detection tasks and have achieved outstanding performance. Compared with conventional convolution, the deformable convolution has an irregular receptive field to adapt to objects with different sizes and shapes. However, the irregularity of the receptive field causes inefficient access to memory and increases the complexity of control logic. Toward hardware-friendly implementation, prior works change the characteristics of deformable convolution by restricting the receptive field, leading to accuracy degradation. In this paper, we develop a dedicated Sampling Core to sample and rearrange the input pixels, enabling the convolution array to access the inputs regularly. In addition, a memory-efficient dataflow is introduced to match the processing speed of the Sampling Core and convolutional array, which improves hardware utilization and reduces access to off-chip memory. Based on these optimizations, we propose a novel hardware architecture for the deformable convolution network, which is the first work to accelerate the original deformable convolution network. With the design of the memory-efficient architecture, the access to the off-chip memory is reduced significantly. We implement it on Xilinx Virtex-7 FPGA, and experiments show that the energy efficiency reaches 50.29 GOPS/W, which is 2.5 times higher compared with executing the same network on GPU.","PeriodicalId":103894,"journal":{"name":"2021 IEEE Workshop on Signal Processing Systems (SiPS)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130240482","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploration of Energy-Efficient Architecture for Graph-Based Point-Cloud Deep Learning","authors":"Jie-Fang Zhang, Zhengya Zhang","doi":"10.1109/SiPS52927.2021.00054","DOIUrl":"https://doi.org/10.1109/SiPS52927.2021.00054","url":null,"abstract":"Deep learning on point clouds has attracted increasing attention in the fields of 3D computer vision and robotics. In particular, graph-based point-cloud deep neural networks (DNNs) have demonstrated promising performance in 3D object classification and scene segmentation tasks. However, the scattered and irregular graph-structured data in a graph-based point-cloud DNN cannot be computed efficiently by existing SIMD architectures and accelerators. Following a review of the challenges of point-cloud DNN and the key edge convolution operation, we provide several directions in optimizing the processing architecture, including computation model, data reuse, and data locality, for achieving an effective acceleration and an improved energy efficiency.","PeriodicalId":103894,"journal":{"name":"2021 IEEE Workshop on Signal Processing Systems (SiPS)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130252331","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"[Copyright notice]","authors":"","doi":"10.1109/sips52927.2021.00003","DOIUrl":"https://doi.org/10.1109/sips52927.2021.00003","url":null,"abstract":"","PeriodicalId":103894,"journal":{"name":"2021 IEEE Workshop on Signal Processing Systems (SiPS)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130420955","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Scalable Generator for Massive MIMO Baseband Processing Systems with Beamspace Channel Estimation","authors":"Yue Dai, Harrison Liew, M. Rasekh, Seyed Hadi Mirfarshbafan, Alexandra Gallyas-Sanhueza, James Dunn, Upamanyu Madhow, Christoph Studer, B. Nikolić","doi":"10.1109/SiPS52927.2021.00040","DOIUrl":"https://doi.org/10.1109/SiPS52927.2021.00040","url":null,"abstract":"This paper describes a scalable, highly portable, and energy-efficient generator for massive multiple-input multiple-output (MIMO) baseband processing systems. This generator is written in Chisel and produces hardware instances for a scalable massive MIMO system employing distributed processing. The generator is parameterized in both the MIMO system and hardware datapath elements. Coupled with a Python-based system simulator, the generator can be adapted to implement other baseband processing algorithms. To demonstrate the adaptability, several generator instances with different parameter values are evaluated by FPGA emulation. In addition, a beamspace calibration and channel denoising algorithm are applied to further improve the channel estimation performance. With those algorithms, the error vector magnitude can be reduced by up 9.2%.","PeriodicalId":103894,"journal":{"name":"2021 IEEE Workshop on Signal Processing Systems (SiPS)","volume":"5 1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130096691","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automatic Generation of Dynamic Inference Architecture for Deep Neural Networks","authors":"Shize Zhao, Liulu He, Xiaoru Xie, Jun Lin, Zhongfeng Wang","doi":"10.1109/SiPS52927.2021.00029","DOIUrl":"https://doi.org/10.1109/SiPS52927.2021.00029","url":null,"abstract":"The computational cost of deep neural network(DNN) model can be reduced dramatically by applying different architectures based on the difficulties of each sample, which is named dynamic inference tech. Manually designed dynamic inference framework is hard to be optimal for the dependency on human experience, which is also time-consuming and labor-intensive. In this paper, we provide an auto-designed AB-Net based on the popular dynamic framework BranchyNet, which is inspired by neural architecture search (NAS). To further accelerate the search procedure, we also develop several specific techniques. Firstly, the search space is optimized by the pre-selection of candidate architectures. Then, a neighborhood greedy search algorithm is developed to efficiently find the optimal architecture in the improved search space. Moreover, our scheme can be extended to the multiple-branch cases to further enhance the performance of the AB-Net. We apply the AB-Net on multiple mainstream models and evaluate them on datasets CIFAR10/100. Compared to the handcrafted BranchyNet, the proposed AB-Net is able to achieve 1.57× computational cost reduction at least even with slight accuracy improvement on CIFAR100. Moreover, the AB-Net also significantly outperforms the S2DNAS on accuracy with similar cost reduction, which is the state-of-the-art automatic dynamic inference architecture.","PeriodicalId":103894,"journal":{"name":"2021 IEEE Workshop on Signal Processing Systems (SiPS)","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129547743","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reconfigurable Neural Synaptic Plasticity-Based Stochastic Deep Neural Network Computing","authors":"Zihan Xia, Ya Dong, Jienan Chen, Rui Wan, Shuai Li, Tingyong Wu","doi":"10.1109/SiPS52927.2021.00048","DOIUrl":"https://doi.org/10.1109/SiPS52927.2021.00048","url":null,"abstract":"With the increasing popularity of deep neural networks (DNNs), a large amount of research effort has been devoted to the hardware acceleration of DNNs to achieve efficient processing. Nevertheless, few works have explored the similarities between the biological essence of DNNs and arithmetic circuits. Moreover, stochastic computing (SC), which implements complex arithmetic operations with simple logic gates, has been applied to the acceleration of DNNs. However, traditional SC suffers from high latency and large hardware cost of pseudo-random number generators (PRNGs). Inspired by neural synaptic plasticity and SC, in this work, we present the reconfigurable neural synaptic plasticity-based computing (RNSP) to mimic the biological neuron behaviors and exploit the parallelism of SC to the full extent while maintaining a small hardware footprint compared to fixed-point counterparts. RNSP converts fixed-point numbers to parallel bits without logic resources, which are then synthesized by bit-wise multiplications and some full adders. In addition, we propose the arithmetic unit based on RNSP and use re-training to mitigate the accuracy degradation. Finally, a convolution engine (CE) built on RNSP with high memory bandwidth efficiency is designed. According to the implementation results on FPGA, the proposed RNSP-based CE outperforms the fixed-point counterpart in terms of power consumption and area.","PeriodicalId":103894,"journal":{"name":"2021 IEEE Workshop on Signal Processing Systems (SiPS)","volume":"7 2-3","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114046739","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fully Convolutional Network-Based DOA Estimation with Acoustic Vector Sensor","authors":"Sifan Wang, J. Geng, Xin Lou","doi":"10.1109/SiPS52927.2021.00014","DOIUrl":"https://doi.org/10.1109/SiPS52927.2021.00014","url":null,"abstract":"In this paper, a learning-based direction of arrival (DOA) estimation pipeline for acoustic vector sensor (AVS) is proposed. In the proposed pipeline, a fully convolutional network (FCN) is introduced for uncontaminated time-frequency (TF) point extraction, which is a crucial step for AVS-based DOA estimation. Unlike conventional direct path dominant (DPD) or single source points (SSP) detection, the uncontaminated TF point extraction problem is modeled as an image segmentation problem, where the direct DOA cues from the spatial response of AVS is utilized for ground truth labeling to generate the training data of the network. With the extracted uncontaminated TF points, the final DOA can be generated using the proposed fuzzy geometric median (FGM) clustering. Simulation results show that the proposed pipeline is capable of improving the accuracy in the cases of small angular difference between acoustic sources and improving robustness in strong reverberation and noise situations.","PeriodicalId":103894,"journal":{"name":"2021 IEEE Workshop on Signal Processing Systems (SiPS)","volume":"292 4","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131923444","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fault-Tolerance of Binarized and Stochastic Computing-based Neural Networks","authors":"Amir Ardakani, A. Ardakani, W. Gross","doi":"10.1109/SiPS52927.2021.00018","DOIUrl":"https://doi.org/10.1109/SiPS52927.2021.00018","url":null,"abstract":"Both binarized and stochastic computing-based neural networks exploit bit-wise operations to replace expensive full-precision multiplications with simple XNOR gates and thus, offer low-cost hardware implementation. In stochastic computing, arithmetic computations are performed on sequences of random bits which can approximate any real values. Stochastic computing-based neural networks benefit from approximate computing and promote fault-tolerant architectures against soft errors in noisy environments. On the other hand, in binarized neural networks, real values are deterministically binarized using the sign function. As a result, any bit-flip in the binarized values dramatically changes the outcome of arithmetic computations and makes binarized neural networks more vulnerable against soft errors. In this paper, we compare these two neural networks against each other in terms of fault-tolerance and hardware complexity (i.e., area and energy efficiency).","PeriodicalId":103894,"journal":{"name":"2021 IEEE Workshop on Signal Processing Systems (SiPS)","volume":"99 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116815597","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hartley Stochastic Computing For Convolutional Neural Networks","authors":"S. H. Mozafari, J. Clark, W. Gross, B. Meyer","doi":"10.1109/SiPS52927.2021.00049","DOIUrl":"https://doi.org/10.1109/SiPS52927.2021.00049","url":null,"abstract":"Energy consumption and the latency of convolutional neural networks (CNNs) are two important factors that limit their applications specifically for embedded devices. Fourier-based frequency domain (FD) convolution is a promising low-cost alter-native to conventional implementations in the spatial domain (SD) for CNNs. FD convolution performs its operation with point-wise multiplications. However, in CNNs, the overhead for the Fourier-based FD-convolution surpasses its computational saving for small filter sizes. In this work, we propose to implement convolutional layers in the FD using the Hartley transformation (HT) instead of the Fourier transformation. We show that the HT can reduce the convolution delay and energy consumption even for small filters. With the HT of parameters, we replace convolution with point-wise multiplications. HT lets us compress input feature maps, in all convolutional layer, before convolving them with filters. To optimize the hardware implementation of our method, we utilize stochastic computing (SC) to perform the point-wise multiplications in the FD. In this regard, we re-formalize the HT to better match with SC. We show that, compared to conventional Fourier-based convolution, Hartley SC-based convolution can achieve 1.33x speedup, and 1.23x energy saving on a Virtex 7 FPGA when we implement AlexNet over CIFAR-10.","PeriodicalId":103894,"journal":{"name":"2021 IEEE Workshop on Signal Processing Systems (SiPS)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127675701","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Novel Blind Detection Method and FPGA Implementation for Energy-Efficient Sidelink Communications","authors":"Chenhao Zhang, Haiqin Hu, Shan Cao, Zhiyuan Jiang","doi":"10.1109/SiPS52927.2021.00010","DOIUrl":"https://doi.org/10.1109/SiPS52927.2021.00010","url":null,"abstract":"A novel physical sidelink control channel (PSCCH) blind detection method based on demodulation reference signal (DMRS) detection is proposed for sidelink communications in cellular vehicular-to-everything (C-V2X). In the proposed method, the user equipment (UE) first performs coherent energy detection on the DMRS positions. According to the information of the time/frequency location where the DMRS is detected, the UE can adjust the decoding area to minimize unnecessary blind decoding attempts. Based on the proposed algorithm and the channel estimation method, a VLSI architecture of joint energy detection and channel estimation (JEC) is proposed. Reference implementation results for a Xilinx Virtex-7 FPGA show that our design can reduce hardware complexity and energy consumption.","PeriodicalId":103894,"journal":{"name":"2021 IEEE Workshop on Signal Processing Systems (SiPS)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131138738","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}