2019 IEEE International Workshop on Signal Processing Systems (SiPS): Latest Publications

A Hybrid GPU + FPGA System Design for Autonomous Driving Cars
2019 IEEE International Workshop on Signal Processing Systems (SiPS) Pub Date : 2019-10-01 DOI: 10.1109/SiPS47522.2019.9020540
Cong Hao, Junli Gu, Deming Chen, A. Sarwari, Zhijie Jin, Husam Abu-Haimed, Daryl Sew, Yuhong Li, Xinheng Liu, Bryan Wu, Dongdong Fu
Abstract: Autonomous driving cars need highly complex hardware and software systems, which require high-performance computing platforms to enable a real-time AI-based perception and decision-making pipeline. The industry has been exploring various in-vehicle accelerators such as GPUs, ASICs and FPGAs. Yet autonomous driving platform design is far from mature when system reliability, redundancy and higher levels of autonomy are taken into account. In this work, we propose a hybrid computing system design, which integrates a GPU as the primary computing system and an FPGA as a secondary system. This hybrid system architecture has multiple advantages: 1) The FPGA can run constantly as a complementary system with very short latency, helping to detect main system failure and anomalous behavior, and contributing to system functionality verification and reliability. 2) If the primary system fails (mostly from sensor or interconnection error), the FPGA will quickly detect the failure and run a safe-mode task with a subset of sensors. 3) The FPGA can be used as an independent computing system to run extra algorithm components that improve overall system autonomy. For example, the FPGA can handle driver monitoring tasks while the GPU focuses on driving functions. Together they can boost the driving function from L2 (constantly requires the driver's attention) to L3 (allows the driver to take their mind off driving for 10 seconds). This paper defines how such a system works, discusses various use cases and potential design challenges, and shares some initial results and insights about how to make such a system deliver the maximum value for autonomous driving.
Citations: 12
Modified Complementary Joint Sparse Representations: A Novel Post-Filtering to MVDR Beamforming
2019 IEEE International Workshop on Signal Processing Systems (SiPS) Pub Date : 2019-10-01 DOI: 10.1109/SiPS47522.2019.9020522
Yuanyuan Zhu, Jiafei Fu, Xu Xu, Z. Ye
Abstract: Post-filtering is a popular technique in multichannel speech enhancement systems for further improving speech quality and intelligibility after beamforming. This paper presents a novel post-filter for minimum variance distortionless response (MVDR) beamforming: a single-channel modified complementary joint sparse representations (M-CJSR) method. First, an MVDR beamformer is used to suppress interference and noise. Subsequently, the proposed M-CJSR approach, based on joint dictionary learning, is applied as a single-microphone post-filter to process the beamformer output. Unlike existing post-filtering techniques, which rely on assumptions about the noise field, this algorithm considers a more generalized signal model that includes ambient noise, such as diffuse noise or white noise, as well as point-source interference. Moreover, the original CJSR method is extended to jointly learn dictionaries not only for the mappings from mixture to speech and noise, but also for the mapping from mixture to interference. To exploit the complementary advantages of the different sparse representations, we design the weighting parameters based on the residual components of the estimated signals. An experimental study consisting of objective evaluations under various conditions verifies the superiority of the proposed algorithm over other state-of-the-art methods.
Citations: 3
SiPS 2019 Conference Committee
2019 IEEE International Workshop on Signal Processing Systems (SiPS) Pub Date : 2019-10-01 DOI: 10.1109/sips47522.2019.9020636
Citations: 0
An Efficient Polynomial Multiplier Architecture for the Bootstrapping Algorithm in a Fully Homomorphic Encryption Scheme
2019 IEEE International Workshop on Signal Processing Systems (SiPS) Pub Date : 2019-10-01 DOI: 10.1109/SiPS47522.2019.9020592
Weihang Tan, Aengran Au, Benjamin Aase, S. Aao, Yingjie Lao
Abstract: The bootstrapping algorithm, the intermediate refreshing procedure of a processed ciphertext, has been the performance bottleneck of various existing Fully Homomorphic Encryption (FHE) schemes. Specifically, the external product of polynomials is the most computationally expensive step of bootstrapping algorithms that are based on the Ring Learning With Errors (RLWE) problem. In this paper, we design a novel and scalable polynomial multiplier architecture for a bootstrapping algorithm, along with a conflict-free memory management scheme that reduces latency while achieving full utilization of the processing elements (PEs). Each PE is a modified radix-2 butterfly unit from the fast Fourier transform (FFT), which can be reconfigured for use in both the number theoretic transform (NTT) and the basic modular multiplication of polynomial multiplication in the external product step. The experimental results show that our design yields a 33% lower area-time product than prior designs.
Citations: 4
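As a rough software illustration of the NTT-based polynomial multiplication that the abstract above refers to, the following Python sketch multiplies two polynomials with a recursive radix-2 butterfly. It is not the paper's architecture: the tiny parameters (modulus 17, length 8, root of unity 2) and the use of a cyclic product modulo x^8 - 1 are assumptions chosen for readability, whereas RLWE schemes typically work modulo x^n + 1 with much larger parameters. All function names are hypothetical.

```python
Q = 17      # assumed toy modulus (prime)
N = 8       # assumed polynomial length (power of two)
OMEGA = 2   # 2 has multiplicative order 8 modulo 17, so it is a primitive N-th root of unity

def ntt(a, omega, q=Q):
    """Recursive radix-2 decimation-in-time NTT of a list whose length is a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    omega_sq = omega * omega % q
    even = ntt(a[0::2], omega_sq, q)
    odd = ntt(a[1::2], omega_sq, q)
    out = [0] * n
    w = 1
    for k in range(n // 2):
        # One butterfly: combine an even/odd pair with the twiddle factor w.
        t = w * odd[k] % q
        out[k] = (even[k] + t) % q
        out[k + n // 2] = (even[k] - t) % q
        w = w * omega % q
    return out

def intt(a, omega, q=Q):
    """Inverse NTT: forward transform with omega^-1, then scale by n^-1 (Python 3.8+ pow)."""
    inv_n = pow(len(a), -1, q)
    inv_omega = pow(omega, -1, q)
    return [x * inv_n % q for x in ntt(a, inv_omega, q)]

def cyclic_multiply(a, b):
    """Product of a and b modulo (x^N - 1, Q) via pointwise multiplication in the NTT domain."""
    fa, fb = ntt(a, OMEGA), ntt(b, OMEGA)
    return intt([x * y % Q for x, y in zip(fa, fb)], OMEGA)

def schoolbook_cyclic(a, b, q=Q):
    """O(n^2) reference implementation used to check the NTT result."""
    n = len(a)
    out = [0] * n
    for i in range(n):
        for j in range(n):
            out[(i + j) % n] = (out[(i + j) % n] + a[i] * b[j]) % q
    return out
```

Calling `cyclic_multiply(a, b)` on two length-8 coefficient lists should agree with `schoolbook_cyclic(a, b)`; the transform replaces the quadratic inner loops with O(n log n) butterflies plus n pointwise products.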
Improving Reliability of ReRAM-Based DNN Implementation through Novel Weight Distribution
2019 IEEE International Workshop on Signal Processing Systems (SiPS) Pub Date : 2019-10-01 DOI: 10.1109/SiPS47522.2019.9020318
Jingtao Li, Manqing Mao, C. Chakrabarti
Abstract: Binary deep neural networks that have been implemented in resistive random access memory (ReRAM) for storage efficiency suffer from poor recognition performance in the presence of hardware errors. This paper addresses this problem by deriving a novel weight distribution and representation scheme that mitigates errors due to faulty ReRAM cells with minimal storage overhead. In the proposed scheme, the weight matrix is partitioned into grains, and each weight in a grain is represented by the sum of a multi-bit mean and a 1-bit deviation. The grain size and the mean-to-deviation ratio of the weights in a grain can be chosen such that the network is resilient to hardware errors. A hybrid processing-in-memory (PIM) architecture is proposed to support this scheme. The mean values are stored in a small SRAM and processed by a CMOS unit, and the deviations are stored and processed by the ReRAM unit. Compared to the baseline binary neural network, which fails in the presence of severe hardware errors, the proposed hybrid scheme shows only mild degradation in recognition performance. Simulation results show the proposed scheme achieves 97.84% test accuracy (a 0.84% accuracy drop) on the MNIST dataset and 88.07% test accuracy (a 1.10% accuracy drop) on the CIFAR-10 dataset under 9.04% stuck-at-1 and 1.75% stuck-at-0 faults.
Citations: 1
[SiPS 2019 Title Page]
2019 IEEE International Workshop on Signal Processing Systems (SiPS) Pub Date : 2019-10-01 DOI: 10.1109/sips47522.2019.9020313
Citations: 0
Pilot-Assisted Methods for Channel Estimation in MIMO-V-OFDM Systems
2019 IEEE International Workshop on Signal Processing Systems (SiPS) Pub Date : 2019-10-01 DOI: 10.1109/SiPS47522.2019.9020482
Wei Zhang, Xuyang Gao, Yibing Shi
Abstract: Multiple-input multiple-output (MIMO) technology with Orthogonal Frequency Division Multiplexing (OFDM) combines the advantages of both MIMO and OFDM. Vector Orthogonal Frequency Division Multiplexing (V-OFDM) is an extension of OFDM that makes data transmission flexible. For MIMO systems using V-OFDM technology, different novel schemes are proposed to improve channel estimation performance for different levels of channel sparsity. A 2-D Kriging interpolation scheme is proposed for non-sparse channels, which can significantly improve the performance of the conventional Least Square (LS) and Minimum Mean Square Error (MMSE) algorithms. When the channel is sparse, the estimation process can be modeled as a sparse recovery problem using compressed sensing (CS) theory. In this paper, the measurement matrix is determined by the pilot locations, and a pilot search algorithm based on a random genetic algorithm (RGA) is proposed to minimize the cross-correlation value of the measurement matrix. In addition, a variable threshold sparsity adaptive matching pursuit (VTSAMP) algorithm is designed to obtain more accurate estimates; it achieves better Normalized Mean Square Error (NMSE) performance, higher calculation speed, and lower implementation complexity.
Citations: 1
A Survey of Computation-Driven Data Encoding
2019 IEEE International Workshop on Signal Processing Systems (SiPS) Pub Date : 2019-10-01 DOI: 10.1109/SiPS47522.2019.9020519
Weikang Qian, Runsheng Wang, Yuan Wang, Marc D. Riedel, Ru Huang
Abstract: Although the metal-oxide-semiconductor field-effect transistor (MOSFET) has been the dominant device for modern very-large-scale integration (VLSI) circuits for more than six decades, with the dawning of a post-Moore era, researchers are trying to find replacements. A foundation of modern digital computing is the encoding of digital values through a binary radix representation. However, as we enter the post-Moore era, the challenges of increasing power density, signal noise, and device unreliability raise the question of whether this basic way of encoding data is still the best choice, particularly with novel electronic devices. Prior work has shown that binary radix encoding has some disadvantages. We argue that it is crucial to rethink the necessity of using this representation in the post-Moore era. In this paper, we review some recent developments in computation-driven data encoding. We begin with stochastic encoding, a representation proposed a long time ago, discussing both its advantages and disadvantages. Then, we review several recent breakthroughs with variations of stochastic encoding that mitigate many of its disadvantages. Finally, we conclude the paper by extrapolating future directions for effective computation-driven data encoding.
Citations: 4
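For readers unfamiliar with the stochastic encoding mentioned in the abstract above, the following minimal Python sketch shows the textbook unipolar form, in which a value in [0, 1] is represented by the fraction of ones in a random bitstream and multiplication reduces to a bitwise AND of two independent streams. This is a generic illustration, not a method taken from the survey; the function names, stream length, and seed are assumptions.

```python
import random

def encode(p, length, rng):
    """Encode a probability p in [0, 1] as a bitstream whose bits are 1 with probability p."""
    return [1 if rng.random() < p else 0 for _ in range(length)]

def decode(bits):
    """Recover the encoded value as the fraction of ones in the stream."""
    return sum(bits) / len(bits)

def stochastic_multiply(p_a, p_b, length=100_000, seed=0):
    """Estimate p_a * p_b by ANDing two independently generated bitstreams."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    a = encode(p_a, length, rng)
    b = encode(p_b, length, rng)
    # For independent streams, P(a_i AND b_i = 1) = p_a * p_b.
    product = [x & y for x, y in zip(a, b)]
    return decode(product)
```

The estimate from `stochastic_multiply(0.5, 0.3)` should land close to 0.15, with an error that shrinks only as the square root of the stream length. That trade of precision for very simple hardware (a single AND gate) is one of the commonly cited advantages and disadvantages of the representation.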
Towards Algebraic Modeling of GPU Memory Access for Bank Conflict Mitigation
2019 IEEE International Workshop on Signal Processing Systems (SiPS) Pub Date : 2019-10-01 DOI: 10.1109/SiPS47522.2019.9020385
Luca Ferranti, J. Boutellier
Abstract: Graphics Processing Units (GPUs) have been widely used in various fields of scientific computing, such as signal processing. GPUs have a hierarchical memory structure with memory layers that are shared between GPU processing elements. Partly due to the complex memory hierarchy, GPU programming is non-trivial, and several aspects must be taken into account, one being memory access patterns. One of the fastest GPU memory layers, shared memory, is grouped into banks to enable fast, parallel access for processing elements. Unfortunately, multiple threads of a GPU program may access the same shared memory bank simultaneously, causing a bank conflict. If this happens, program execution slows down, as memory accesses have to be rescheduled to determine which instruction to execute first. Bank conflicts are not taken into account automatically by the compiler, and hence the programmer must detect and deal with them prior to program execution. In this paper, we present an algebraic approach to detect bank conflicts and prove some theoretical results that can be used to predict when bank conflicts happen and how to avoid them. In addition, our experimental results illustrate the savings in computation time.
Citations: 0
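The kind of prediction described in the abstract above can be illustrated with a small Python model of strided shared-memory access. This sketch is not the paper's algebraic framework; it assumes 32 banks of 4-byte words and a 32-thread warp (a common configuration on recent NVIDIA GPUs, not stated in the abstract), and it assumes that threads reading the same word are served by a broadcast rather than a conflict. The function name is hypothetical.

```python
from collections import Counter
from math import gcd

NUM_BANKS = 32   # assumed number of shared memory banks
WARP_SIZE = 32   # assumed number of threads issuing the access together

def conflict_degree(stride, num_banks=NUM_BANKS, warp_size=WARP_SIZE):
    """Worst-case number of distinct words that map to one bank when
    thread t accesses the 4-byte word at index t * stride."""
    # Distinct words only: several threads reading the same word do not conflict.
    words = {tid * stride for tid in range(warp_size)}
    # Consecutive words are assumed to live in consecutive banks.
    banks = Counter(word % num_banks for word in words)
    return max(banks.values())
```

Under these assumptions, `conflict_degree(1)` is 1 (conflict-free), `conflict_degree(2)` is 2, and `conflict_degree(32)` is 32 (fully serialized). For nonzero strides in this configuration the result equals `gcd(stride, 32)`, which is why odd strides, often obtained by padding arrays by one element, avoid conflicts.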
AdaBoost-assisted Extreme Learning Machine for Efficient Online Sequential Classification
2019 IEEE International Workshop on Signal Processing Systems (SiPS) Pub Date : 2019-09-16 DOI: 10.1109/SiPS47522.2019.9020609
Yi-Ta Chen, Yu-Chuan Chuang, A. Wu
Abstract: In this paper, we propose an AdaBoost-assisted extreme learning machine for efficient online sequential classification (AOS-ELM). To achieve better accuracy in online sequential learning scenarios, we utilize the cost-sensitive algorithm AdaBoost, which diversifies the weak classifiers, and add a forgetting mechanism, which stabilizes performance during the training procedure. Hence, AOS-ELM adapts better to sequentially arriving data than other voting-based methods. The experimental results show that AOS-ELM can achieve 94.41% accuracy on the MNIST dataset, which is the theoretical accuracy bound achieved by the original batch learning algorithm, AdaBoost-ELM. Moreover, with the forgetting mechanism, the standard deviation of accuracy during the online sequential learning process is reduced to 8.26x.
Citations: 1