Latest publications — 2021 IEEE Workshop on Signal Processing Systems (SiPS)

A Memory-Efficient Hardware Architecture for Deformable Convolutional Networks
2021 IEEE Workshop on Signal Processing Systems (SiPS) Pub Date : 2021-10-01 DOI: 10.1109/SiPS52927.2021.00033
Yue Yu, Jiapeng Luo, W. Mao, Zhongfeng Wang
In recent years, deformable convolutional networks have been widely adopted in object detection tasks and have achieved outstanding performance. Compared with conventional convolution, deformable convolution has an irregular receptive field that adapts to objects of different sizes and shapes. However, this irregularity causes inefficient memory access and increases the complexity of the control logic. Toward a hardware-friendly implementation, prior works change the characteristics of deformable convolution by restricting the receptive field, leading to accuracy degradation. In this paper, we develop a dedicated Sampling Core to sample and rearrange the input pixels, enabling the convolution array to access the inputs regularly. In addition, a memory-efficient dataflow is introduced to match the processing speeds of the Sampling Core and the convolution array, which improves hardware utilization and reduces access to off-chip memory. Based on these optimizations, we propose a novel hardware architecture for deformable convolutional networks, which is the first work to accelerate the original deformable convolution network. With this memory-efficient architecture, access to off-chip memory is reduced significantly. We implement the design on a Xilinx Virtex-7 FPGA, and experiments show that its energy efficiency reaches 50.29 GOPS/W, 2.5× higher than executing the same network on a GPU.
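The irregular sampling that makes deformable convolution hard to accelerate can be illustrated with a short bilinear-interpolation sketch. The function names are illustrative, not the paper's; in the proposed design the Sampling Core performs this gather in hardware (and handles boundaries) so that the convolution array sees a regular input layout:

```python
import numpy as np

def bilinear_sample(fmap, y, x):
    """Sample a 2-D feature map at a fractional position (y, x).
    Assumes (y, x) is in-bounds; a real design also clamps/pads edges."""
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1, x1 = y0 + 1, x0 + 1
    wy, wx = y - y0, x - x0
    return ((1 - wy) * (1 - wx) * fmap[y0, x0] + (1 - wy) * wx * fmap[y0, x1]
            + wy * (1 - wx) * fmap[y1, x0] + wy * wx * fmap[y1, x1])

def deformable_gather(fmap, base, offsets):
    """Gather the kernel taps for one output position: each tap is
    displaced by a learned fractional offset before sampling."""
    by, bx = base
    return np.array([bilinear_sample(fmap, by + dy, bx + dx)
                     for dy, dx in offsets])
```

Because the offsets are data-dependent and fractional, each output pixel touches a different, scattered set of inputs — the access pattern the Sampling Core regularizes.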
Citations: 5
Exploration of Energy-Efficient Architecture for Graph-Based Point-Cloud Deep Learning
2021 IEEE Workshop on Signal Processing Systems (SiPS) Pub Date : 2021-10-01 DOI: 10.1109/SiPS52927.2021.00054
Jie-Fang Zhang, Zhengya Zhang
Deep learning on point clouds has attracted increasing attention in the fields of 3D computer vision and robotics. In particular, graph-based point-cloud deep neural networks (DNNs) have demonstrated promising performance in 3D object classification and scene segmentation tasks. However, the scattered and irregular graph-structured data in a graph-based point-cloud DNN cannot be computed efficiently by existing SIMD architectures and accelerators. Following a review of the challenges of point-cloud DNNs and the key edge convolution operation, we provide several directions for optimizing the processing architecture — including the computation model, data reuse, and data locality — to achieve effective acceleration and improved energy efficiency.
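The edge convolution the abstract identifies as the key operation (as formulated in graph-based point-cloud networks such as DGCNN) can be sketched as follows; the helper names and the single shared linear map are illustrative assumptions:

```python
import numpy as np

def knn_indices(points, k):
    """Indices of the k nearest neighbors of each point (excluding itself)."""
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    return np.argsort(d, axis=1)[:, 1:k + 1]

def edge_conv(points, weight, k=2):
    """One edge-convolution layer: build edge features [x_i, x_j - x_i],
    apply a shared linear map, and max-aggregate over each neighborhood."""
    idx = knn_indices(points, k)
    out = np.empty((len(points), weight.shape[1]))
    for i, nbrs in enumerate(idx):
        edges = np.stack([np.concatenate([points[i], points[j] - points[i]])
                          for j in nbrs])
        out[i] = (edges @ weight).max(axis=0)
    return out
```

The kNN graph changes with every input cloud, so the gather pattern is irregular — exactly the property that defeats fixed SIMD datapaths and motivates the architectural directions above.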
Citations: 1
A Scalable Generator for Massive MIMO Baseband Processing Systems with Beamspace Channel Estimation
2021 IEEE Workshop on Signal Processing Systems (SiPS) Pub Date : 2021-10-01 DOI: 10.1109/SiPS52927.2021.00040
Yue Dai, Harrison Liew, M. Rasekh, Seyed Hadi Mirfarshbafan, Alexandra Gallyas-Sanhueza, James Dunn, Upamanyu Madhow, Christoph Studer, B. Nikolić
This paper describes a scalable, highly portable, and energy-efficient generator for massive multiple-input multiple-output (MIMO) baseband processing systems. The generator is written in Chisel and produces hardware instances for a scalable massive MIMO system employing distributed processing. It is parameterized in both the MIMO system and the hardware datapath elements. Coupled with a Python-based system simulator, the generator can be adapted to implement other baseband processing algorithms. To demonstrate this adaptability, several generator instances with different parameter values are evaluated by FPGA emulation. In addition, beamspace calibration and channel denoising algorithms are applied to further improve channel estimation performance. With these algorithms, the error vector magnitude can be reduced by up to 9.2%.
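Error vector magnitude, the metric behind the 9.2% figure, is commonly computed as the RMS error power relative to the RMS power of the reference constellation; a minimal sketch (the function name is ours, and normalization conventions vary across standards):

```python
import numpy as np

def evm_percent(rx, ref):
    """EVM as a percentage: RMS error power over RMS reference power."""
    err_power = np.mean(np.abs(rx - ref) ** 2)
    ref_power = np.mean(np.abs(ref) ** 2)
    return 100.0 * np.sqrt(err_power / ref_power)
```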
Citations: 1
Automatic Generation of Dynamic Inference Architecture for Deep Neural Networks
2021 IEEE Workshop on Signal Processing Systems (SiPS) Pub Date : 2021-10-01 DOI: 10.1109/SiPS52927.2021.00029
Shize Zhao, Liulu He, Xiaoru Xie, Jun Lin, Zhongfeng Wang
The computational cost of a deep neural network (DNN) model can be reduced dramatically by applying different architectures depending on the difficulty of each sample, a technique known as dynamic inference. A manually designed dynamic inference framework is unlikely to be optimal because it depends on human experience, and designing one is time-consuming and labor-intensive. In this paper, we present an auto-designed AB-Net based on the popular dynamic framework BranchyNet, inspired by neural architecture search (NAS). To accelerate the search procedure, we develop several specific techniques. First, the search space is optimized by pre-selecting candidate architectures. Then, a neighborhood greedy search algorithm is developed to efficiently find the optimal architecture in the improved search space. Moreover, our scheme can be extended to multiple-branch cases to further enhance the performance of the AB-Net. We apply the AB-Net to multiple mainstream models and evaluate them on the CIFAR10/100 datasets. Compared to the handcrafted BranchyNet, the proposed AB-Net achieves at least a 1.57× computational cost reduction with a slight accuracy improvement on CIFAR100. Moreover, the AB-Net also significantly outperforms S2DNAS, the state-of-the-art automatic dynamic inference architecture, on accuracy with a similar cost reduction.
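BranchyNet-style dynamic inference attaches side-branch classifiers and exits early when a branch is confident; a common confidence test is the entropy of the branch's softmax output. A minimal sketch with illustrative names (the paper's AB-Net searches over where such branches go, which this sketch does not model):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def entropy(p):
    return float(-np.sum(p * np.log(p + 1e-12)))

def dynamic_infer(x, branches, threshold=0.3):
    """Evaluate side-branch classifiers in order; return (exit index, class)
    at the first branch whose softmax entropy falls below the threshold."""
    for i, branch in enumerate(branches):
        p = softmax(branch(x))
        if entropy(p) < threshold:
            return i, int(np.argmax(p))
    return len(branches) - 1, int(np.argmax(p))  # fall through to final exit
```

Easy samples exit at a cheap early branch; only hard samples pay for the full network, which is where the average-cost reduction comes from.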
Citations: 0
Reconfigurable Neural Synaptic Plasticity-Based Stochastic Deep Neural Network Computing
2021 IEEE Workshop on Signal Processing Systems (SiPS) Pub Date : 2021-10-01 DOI: 10.1109/SiPS52927.2021.00048
Zihan Xia, Ya Dong, Jienan Chen, Rui Wan, Shuai Li, Tingyong Wu
With the increasing popularity of deep neural networks (DNNs), a large amount of research effort has been devoted to hardware acceleration of DNNs to achieve efficient processing. Nevertheless, few works have explored the similarities between the biological essence of DNNs and arithmetic circuits. Moreover, stochastic computing (SC), which implements complex arithmetic operations with simple logic gates, has been applied to the acceleration of DNNs. However, traditional SC suffers from high latency and the large hardware cost of pseudo-random number generators (PRNGs). Inspired by neural synaptic plasticity and SC, in this work we present reconfigurable neural synaptic plasticity-based computing (RNSP) to mimic biological neuron behaviors and exploit the parallelism of SC to the full extent, while maintaining a small hardware footprint compared to fixed-point counterparts. RNSP converts fixed-point numbers to parallel bits without logic resources, which are then synthesized by bit-wise multiplications and full adders. In addition, we propose an arithmetic unit based on RNSP and use re-training to mitigate accuracy degradation. Finally, a convolution engine (CE) built on RNSP with high memory bandwidth efficiency is designed. According to implementation results on FPGA, the proposed RNSP-based CE outperforms its fixed-point counterpart in terms of power consumption and area.
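The bipolar stochastic-computing multiplication that traditional SC (and works building on it, like this one) relies on maps a value v ∈ [−1, 1] to a bitstream with P(1) = (v + 1)/2, after which a single XNOR gate multiplies two independent streams. A software sketch — the names are ours, and a hardware PRNG stands in for `default_rng`:

```python
import numpy as np

def to_bipolar_stream(v, n, rng):
    """Encode v in [-1, 1] as a stochastic bitstream with P(1) = (v + 1) / 2."""
    return rng.random(n) < (v + 1) / 2

def sc_multiply(a, b, n=200_000, seed=0):
    """Bipolar stochastic multiplication: XNOR of two independent streams,
    decoded back to [-1, 1]. Accuracy improves with stream length n."""
    rng = np.random.default_rng(seed)
    sa = to_bipolar_stream(a, n, rng)
    sb = to_bipolar_stream(b, n, rng)
    xnor = ~(sa ^ sb)
    return 2 * xnor.mean() - 1
```

The long streams needed for accuracy are exactly the latency/PRNG-cost problem the abstract cites, which RNSP addresses by generating parallel bits instead of long serial streams.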
Citations: 0
Fully Convolutional Network-Based DOA Estimation with Acoustic Vector Sensor
2021 IEEE Workshop on Signal Processing Systems (SiPS) Pub Date : 2021-10-01 DOI: 10.1109/SiPS52927.2021.00014
Sifan Wang, J. Geng, Xin Lou
In this paper, a learning-based direction of arrival (DOA) estimation pipeline for the acoustic vector sensor (AVS) is proposed. In the proposed pipeline, a fully convolutional network (FCN) is introduced for uncontaminated time-frequency (TF) point extraction, a crucial step for AVS-based DOA estimation. Unlike conventional direct path dominant (DPD) or single source points (SSP) detection, the uncontaminated TF point extraction problem is modeled as an image segmentation problem, where the direct DOA cues from the spatial response of the AVS are utilized for ground-truth labeling to generate the network's training data. With the extracted uncontaminated TF points, the final DOA can be generated using the proposed fuzzy geometric median (FGM) clustering. Simulation results show that the proposed pipeline improves accuracy in cases of small angular difference between acoustic sources and improves robustness under strong reverberation and noise.
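The fuzzy geometric median clustering is the paper's own contribution; as background, the plain geometric median — the point minimizing the summed Euclidean distances to the samples, and hence robust to outlier TF points in a way the mean is not — can be computed with Weiszfeld's iteration:

```python
import numpy as np

def geometric_median(points, iters=200, eps=1e-9):
    """Weiszfeld iteration: iteratively reweighted average with weights
    1/distance, converging to the geometric median of the point set."""
    y = points.mean(axis=0)
    for _ in range(iters):
        d = np.maximum(np.linalg.norm(points - y, axis=1), eps)
        w = 1.0 / d
        y_next = (w[:, None] * points).sum(axis=0) / w.sum()
        if np.linalg.norm(y_next - y) < eps:
            return y_next
        y = y_next
    return y
```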
Citations: 1
Fault-Tolerance of Binarized and Stochastic Computing-based Neural Networks
2021 IEEE Workshop on Signal Processing Systems (SiPS) Pub Date : 2021-10-01 DOI: 10.1109/SiPS52927.2021.00018
Amir Ardakani, A. Ardakani, W. Gross
Both binarized and stochastic computing-based neural networks exploit bit-wise operations to replace expensive full-precision multiplications with simple XNOR gates and thus offer low-cost hardware implementations. In stochastic computing, arithmetic computations are performed on sequences of random bits that can approximate any real value. Stochastic computing-based neural networks benefit from approximate computing and promote fault-tolerant architectures against soft errors in noisy environments. In binarized neural networks, on the other hand, real values are deterministically binarized using the sign function. As a result, any bit-flip in the binarized values dramatically changes the outcome of arithmetic computations and makes binarized neural networks more vulnerable to soft errors. In this paper, we compare these two kinds of neural networks in terms of fault-tolerance and hardware complexity (i.e., area and energy efficiency).
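The vulnerability argument can be made concrete: flipping one binarized weight changes it from ±1 to ∓1, shifting a dot-product output by 2|x_i|, whereas flipping one bit of a long stochastic stream perturbs the decoded value by only ~2/n. A small sketch of the binarized case (variable names are ours):

```python
import numpy as np

def binarize(w):
    """Deterministic binarization with the sign function (zero maps to +1)."""
    return np.where(w >= 0, 1.0, -1.0)

x = np.ones(8)
wb = binarize(np.linspace(-1.0, 1.0, 8))  # four -1s, four +1s
clean = x @ wb                             # balanced: dot product is 0
faulty = wb.copy()
faulty[0] = -faulty[0]                     # a single soft error
# the output shifts by 2 * |x[0]| = 2 -- the full swing of that weight
```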
Citations: 1
Hartley Stochastic Computing For Convolutional Neural Networks
2021 IEEE Workshop on Signal Processing Systems (SiPS) Pub Date : 2021-10-01 DOI: 10.1109/SiPS52927.2021.00049
S. H. Mozafari, J. Clark, W. Gross, B. Meyer
Energy consumption and latency are two important factors that limit the application of convolutional neural networks (CNNs), particularly on embedded devices. Fourier-based frequency-domain (FD) convolution is a promising low-cost alternative to conventional spatial-domain (SD) implementations for CNNs, performing its operation with point-wise multiplications. However, in CNNs the overhead of Fourier-based FD convolution surpasses its computational saving for small filter sizes. In this work, we propose implementing convolutional layers in the FD using the Hartley transform (HT) instead of the Fourier transform. We show that the HT can reduce convolution delay and energy consumption even for small filters. With the HT of the parameters, we replace convolution with point-wise multiplications, and the HT lets us compress input feature maps in all convolutional layers before convolving them with filters. To optimize the hardware implementation of our method, we utilize stochastic computing (SC) to perform the point-wise multiplications in the FD, re-formalizing the HT to better match SC. We show that, compared to conventional Fourier-based convolution, Hartley SC-based convolution achieves a 1.33× speedup and a 1.23× energy saving on a Virtex-7 FPGA when implementing AlexNet on CIFAR-10.
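The Hartley transform of a real signal is itself real, so it needs no complex arithmetic — part of its appeal over the Fourier transform here. It can be obtained from the FFT as H[k] = Re(F[k]) − Im(F[k]); a minimal sketch (this is the standard transform, not the paper's SC-oriented re-formulation):

```python
import numpy as np

def dht(x):
    """Discrete Hartley transform via the FFT:
    H[k] = Re(F[k]) - Im(F[k]) = sum_n x[n] * cas(2*pi*n*k/N),
    where cas(t) = cos(t) + sin(t)."""
    F = np.fft.fft(x)
    return F.real - F.imag
```

For real input the DHT is also its own inverse up to a factor of N, i.e. dht(dht(x)) = N·x, so the same datapath serves both directions.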
Citations: 1
A Novel Blind Detection Method and FPGA Implementation for Energy-Efficient Sidelink Communications
2021 IEEE Workshop on Signal Processing Systems (SiPS) Pub Date : 2021-10-01 DOI: 10.1109/SiPS52927.2021.00010
Chenhao Zhang, Haiqin Hu, Shan Cao, Zhiyuan Jiang
A novel physical sidelink control channel (PSCCH) blind detection method based on demodulation reference signal (DMRS) detection is proposed for sidelink communications in cellular vehicle-to-everything (C-V2X). In the proposed method, the user equipment (UE) first performs coherent energy detection at the DMRS positions. Using the time/frequency locations where the DMRS is detected, the UE can then adjust the decoding area to minimize unnecessary blind decoding attempts. Based on the proposed algorithm and the channel estimation method, a VLSI architecture for joint energy detection and channel estimation (JEC) is proposed. Reference implementation results for a Xilinx Virtex-7 FPGA show that our design reduces hardware complexity and energy consumption.
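Coherent energy detection at a candidate DMRS position can be sketched as a normalized correlation against the known sequence; the function name and threshold are illustrative, and the paper's JEC architecture additionally combines this step with channel estimation:

```python
import numpy as np

def dmrs_energy_detect(rx, dmrs, threshold=0.5):
    """Correlate received resource elements with the known DMRS sequence
    and compare the normalized correlation energy against a threshold."""
    corr = np.vdot(dmrs, rx)                     # coherent combining
    norm = np.vdot(dmrs, dmrs).real * np.vdot(rx, rx).real
    energy = np.abs(corr) ** 2 / norm
    return energy >= threshold, energy
```

Positions that fail the test are skipped, which is how unnecessary blind decoding attempts (and their energy cost) are avoided.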
Citations: 1