2019 IEEE International Workshop on Signal Processing Systems (SiPS): Latest Publications

Lattice-Reduction-Aided Symbol-Wise Intra-Iterative Interference Cancellation Detector for Massive MIMO System
2019 IEEE International Workshop on Signal Processing Systems (SiPS) Pub Date : 2019-10-01 DOI: 10.1109/SiPS47522.2019.9020430
Hsiao-Yu Yeh, Yuan-Hao Huang
{"title":"Lattice-Reduction-Aided Symbol-Wise Intra-Iterative Interference Cancellation Detector for Massive MIMO System","authors":"Hsiao-Yu Yeh, Yuan-Hao Huang","doi":"10.1109/SiPS47522.2019.9020430","DOIUrl":"https://doi.org/10.1109/SiPS47522.2019.9020430","url":null,"abstract":"Massive multiple-input multiple-output (MIMO) systems play an important role in increasing spectral efficiency in fifth-generation (5G) cellular communication. MIMO detection complexity increases significantly with the number of antennas. Thus, designing a high-performance, low-complexity detector for massive MIMO is a challenging issue for the 5G system. This paper proposes a lattice-reduction-aided (LRA) symbol-wise (SW) detection technique to enhance the performance of the intra-iterative interference cancellation (IIC) detector based on Newton’s method. The proposed SW IIC detector has near minimum-mean-square-error performance with faster convergence and lower computational complexity than the original IIC detector. In a 64-QAM $128 \\times 8$ uplink MIMO system, the proposed LRA SW IIC detector reduces the computational complexity of the original IIC detector by about 95.35% at the same BER performance. Considering the preprocessing complexity of the LR in a time-varying channel, the proposed LRA SW IIC detector still has lower complexity when the coherent frame size is larger than 12 MIMO symbols.","PeriodicalId":256971,"journal":{"name":"2019 IEEE International Workshop on Signal Processing Systems (SiPS)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128393962","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
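The abstract above mentions an IIC detector "based on Newton's method" without giving the iteration. One common way Newton's method appears in massive MIMO detection is the Newton-Schulz iteration for approximating a matrix inverse, X <- X(2I - AX). The sketch below shows only that generic iteration, in plain Python on a small real matrix. Whether the paper uses this exact form is an assumption, and the matrix `A`, the function names, and the iteration count are all hypothetical.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def newton_inverse(a, iterations=6):
    """Approximate inv(a) with the Newton-Schulz iteration X <- X (2I - A X).

    The start value is the inverse of the diagonal of `a`. This sketch
    assumes `a` is strictly diagonally dominant so the iteration converges.
    """
    n = len(a)
    x = [[(1.0 / a[i][i]) if i == j else 0.0 for j in range(n)]
         for i in range(n)]
    for _ in range(iterations):
        ax = mat_mul(a, x)
        # correction = 2I - A X
        correction = [[(2.0 if i == j else 0.0) - ax[i][j] for j in range(n)]
                      for i in range(n)]
        x = mat_mul(x, correction)
    return x


# Hypothetical, strictly diagonally dominant Gram-like matrix.
A = [[4.0, 1.0, 0.5],
     [1.0, 5.0, 1.0],
     [0.5, 1.0, 6.0]]
X = newton_inverse(A)
residual = mat_mul(A, X)  # should be close to the identity matrix
```

The error matrix I - AX is squared at every step, so a handful of iterations is enough when the diagonal dominates, which is the situation channel hardening tends to produce in massive MIMO.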
Design and Evaluation of a Power-Efficient Approximate Systolic Array Architecture for Matrix Multiplication
2019 IEEE International Workshop on Signal Processing Systems (SiPS) Pub Date : 2019-10-01 DOI: 10.1109/SiPS47522.2019.9020404
Haroon Waris, Chenghua Wang, Weiqiang Liu, F. Lombardi
{"title":"Design and Evaluation of a Power-Efficient Approximate Systolic Array Architecture for Matrix Multiplication","authors":"Haroon Waris, Chenghua Wang, Weiqiang Liu, F. Lombardi","doi":"10.1109/SiPS47522.2019.9020404","DOIUrl":"https://doi.org/10.1109/SiPS47522.2019.9020404","url":null,"abstract":"Matrix multiplication (MM) is a basic operation for many Digital Signal Processing applications. A Systolic Array (SA) is often considered one of the most favorable architectures to achieve high performance for matrix multiplication. In this paper, the design exploration for an approximate SA is pursued; three design schemes are proposed by introducing approximation in multiple sub-modules. An approximation factor $\\alpha$ is introduced; it is related to the inexact columns in the SA to explore the accuracy-efficiency trade-off present in the proposed designs. In the evaluation, an 8-bit input operand matrix multiplication is considered; the Synopsys Design Compiler at 45nm technology node is used to establish hardware-related metrics. The Error Rate (ER), Normalized Mean Error Distance (NMED) and Mean Relative Error Distance (MRED) are used as figures of merit for error analysis. Results show that the proposed architecture for 8-bit matrix multiplication with an approximation factor $\\alpha=7$ has lower power consumption than existing inexact designs found in the technical literature with comparable NMED. In addition, a power delay product (PDP) vs NMED analysis shows that the proposed designs have a lower PDP and are therefore applicable to low-power applications. The practicality of the proposed architecture is established by computing the Discrete Cosine Transform.","PeriodicalId":256971,"journal":{"name":"2019 IEEE International Workshop on Signal Processing Systems (SiPS)","volume":"64 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129493152","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
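The abstract above names ER, NMED and MRED as its error figures of merit but does not define them. A minimal sketch of how these are commonly computed for an approximate multiplier follows, assuming unsigned operands, exhaustive enumeration of all input pairs, NMED normalized by the maximum exact product, and MRED averaged over pairs with a nonzero exact product. The `truncated_mul` multiplier is a hypothetical stand-in for illustration, not one of the paper's designs.

```python
def error_metrics(approx_mul, bits=8):
    """Return (ER, NMED, MRED) of approx_mul over all unsigned operand pairs."""
    max_operand = (1 << bits) - 1
    max_product = max_operand * max_operand
    total = 0
    wrong = 0
    sum_ed = 0
    sum_red = 0.0
    nonzero = 0
    for a in range(max_operand + 1):
        for b in range(max_operand + 1):
            exact = a * b
            ed = abs(approx_mul(a, b) - exact)  # error distance
            total += 1
            if ed != 0:
                wrong += 1
            sum_ed += ed
            if exact != 0:  # assumption: skip pairs whose exact product is 0
                sum_red += ed / exact
                nonzero += 1
    er = wrong / total                      # Error Rate
    nmed = (sum_ed / total) / max_product   # Normalized Mean Error Distance
    mred = sum_red / nonzero                # Mean Relative Error Distance
    return er, nmed, mred


def truncated_mul(a, b, dropped_bits=4):
    """Hypothetical approximate multiplier: zero the low bits of the product."""
    return ((a * b) >> dropped_bits) << dropped_bits


er, nmed, mred = error_metrics(truncated_mul)
print(f"ER={er:.4f} NMED={nmed:.6f} MRED={mred:.6f}")
```

An exact multiplier passed to `error_metrics` yields zero for all three metrics, which makes a convenient sanity check before evaluating an inexact design.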
AVX-512 Based Software Decoding for 5G LDPC Codes
2019 IEEE International Workshop on Signal Processing Systems (SiPS) Pub Date : 2019-10-01 DOI: 10.1109/SiPS47522.2019.9020587
Yi Xu, Wen Wang, Z. Xu, Xiqi Gao
{"title":"AVX-512 Based Software Decoding for 5G LDPC Codes","authors":"Yi Xu, Wen Wang, Z. Xu, Xiqi Gao","doi":"10.1109/SiPS47522.2019.9020587","DOIUrl":"https://doi.org/10.1109/SiPS47522.2019.9020587","url":null,"abstract":"In this paper, we investigate how the 5G NR LDPC codes can be decoded effectively by general-purpose processors (GPPs) with single-instruction multiple-data (SIMD) acceleration and evaluate the corresponding achievable throughput on newly released Intel Xeon CPUs. First, a general software implementation architecture with SIMD acceleration for horizontal-layered LDPC decoding is presented, where the parallelism is achieved in an intra-block manner. By utilizing the Intel Advanced Vector Extensions 512 (AVX-512) instruction set, the efficiency of parallelism is maximized and therefore the capacity of x86 processors can be fully exploited. In addition, new features of AVX-512 are further exploited to optimize load and store operations as well as preprocessing to reduce the operation cost. Experimental results show that Intel Xeon Gold 6154 processors can achieve 42 to 272 Mbps throughput with a single core for ten layered decoding iterations for various code rates and block lengths. The typical processing latency is below 100 $\\mu s$. Consequently, an 18-core Intel Xeon CPU can achieve up to 5 Gbps decoding throughput.","PeriodicalId":256971,"journal":{"name":"2019 IEEE International Workshop on Signal Processing Systems (SiPS)","volume":"161 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115265136","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
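As a quick consistency check on the figures quoted in the abstract above, the single-core peak can be multiplied by the core count. This assumes throughput scales linearly across the 18 cores, an idealization the abstract does not state.

```latex
18 \times 272\ \text{Mbps} = 4896\ \text{Mbps} \approx 4.9\ \text{Gbps}
```

Under that assumption the result is in line with the quoted "up to 5 Gbps" aggregate decoding throughput.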
A Unified and Flexible Eigen-Solver for Rank-Deficient Matrix in MIMO Precoding/Beamforming Applications
2019 IEEE International Workshop on Signal Processing Systems (SiPS) Pub Date : 2019-10-01 DOI: 10.1109/SiPS47522.2019.9020368
Su-An Chou, A. E. Rakhmania, P. Tsai
{"title":"A Unified and Flexible Eigen-Solver for Rank-Deficient Matrix in MIMO Precoding/Beamforming Applications","authors":"Su-An Chou, A. E. Rakhmania, P. Tsai","doi":"10.1109/SiPS47522.2019.9020368","DOIUrl":"https://doi.org/10.1109/SiPS47522.2019.9020368","url":null,"abstract":"Eigenvalue decomposition (EVD) is a widely adopted technique to separate signal, interference, and noise subspaces. This paper presents a unified eigen-solver based on QR decomposition (QRD) to generate the eigenpairs associated with the largest eigenvalues or zero eigenvalues, which are required in MIMO hybrid beamforming systems that need interference suppression. A non-uniformly constrained deflation is proposed, which forces the matrix to deflate in the beginning and efficiently allocates the computation power to the eigenpairs associated with the largest eigenvalues. The computational complexity of generating the eigenpairs of interest is also evaluated for various matrix dimensions. The results demonstrate that the non-uniformly constrained deflation is effective and that more computations can be saved if the desired number of eigenpairs is smaller than the rank of the matrix.","PeriodicalId":256971,"journal":{"name":"2019 IEEE International Workshop on Signal Processing Systems (SiPS)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126381815","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A Distributed Detection Algorithm For Uplink Massive MIMO Systems
2019 IEEE International Workshop on Signal Processing Systems (SiPS) Pub Date : 2019-10-01 DOI: 10.1109/SiPS47522.2019.9020489
Qiufeng Liu, Hao Liu, Ying Yan, Peng Wu
{"title":"A Distributed Detection Algorithm For Uplink Massive MIMO Systems","authors":"Qiufeng Liu, Hao Liu, Ying Yan, Peng Wu","doi":"10.1109/SiPS47522.2019.9020489","DOIUrl":"https://doi.org/10.1109/SiPS47522.2019.9020489","url":null,"abstract":"Massive multiple-input multiple-output (MIMO) uplink detection algorithms usually rely on a centralized base station (BS) architecture, which results in an excessive amount of raw baseband data being transmitted to the central processing unit (CU) when the number of antennas is large. Considering the channel hardening that occurs in massive MIMO channels, this paper develops a novel distributed algorithm based on a daisy chain architecture, where the BS antennas are divided into clusters and each cluster owns independent computing hardware for signal processing. In distributed signal detection, each cluster needs only local channel state information (CSI), received data, and some data exchange between clusters. It is demonstrated that the algorithm achieves a better tradeoff between complexity and performance than other existing distributed methods.","PeriodicalId":256971,"journal":{"name":"2019 IEEE International Workshop on Signal Processing Systems (SiPS)","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129585443","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
An ISAR Imaging Algorithm Based on RCA for Micro-Doppler Effect Suppression
2019 IEEE International Workshop on Signal Processing Systems (SiPS) Pub Date : 2019-10-01 DOI: 10.1109/SiPS47522.2019.9020383
Xinbo Xu, Xinfei Jin, Fulin Su
{"title":"An ISAR Imaging Algorithm Based on RCA for Micro-Doppler Effect Suppression","authors":"Xinbo Xu, Xinfei Jin, Fulin Su","doi":"10.1109/SiPS47522.2019.9020383","DOIUrl":"https://doi.org/10.1109/SiPS47522.2019.9020383","url":null,"abstract":"In Inverse Synthetic Aperture Radar (ISAR) imaging, the micro-Doppler (m-D) effect caused by micro-motion parts of the target not only makes parameter extraction and motion compensation difficult but also causes image defocusing. It appears as azimuth interference sidebands and seriously degrades image quality. Therefore, studying the micro-Doppler suppression problem in practical applications is of great importance for high-quality ISAR imaging. In this paper, a reasonable and effective mathematical model is established, and an m-D suppression algorithm inspired by the robust principal component analysis (RPCA) matrix reconstruction theory is proposed. Our algorithm transforms the problem of separating radar echoes into the decomposition of a low-rank m-D signal matrix of the rotating components and a sparse ISAR image signal matrix of the main body. Moreover, experimental results based on simulated and real measured data are used to verify the effectiveness of our method.","PeriodicalId":256971,"journal":{"name":"2019 IEEE International Workshop on Signal Processing Systems (SiPS)","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126853050","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
A Low-Latency and Low-Complexity Hardware Architecture for CTC Beam Search Decoding
2019 IEEE International Workshop on Signal Processing Systems (SiPS) Pub Date : 2019-10-01 DOI: 10.1109/SiPS47522.2019.9020324
Siyuan Lu, Jinming Lu, Jun Lin, Zhongfeng Wang, L. Du
{"title":"A Low-Latency and Low-Complexity Hardware Architecture for CTC Beam Search Decoding","authors":"Siyuan Lu, Jinming Lu, Jun Lin, Zhongfeng Wang, L. Du","doi":"10.1109/SiPS47522.2019.9020324","DOIUrl":"https://doi.org/10.1109/SiPS47522.2019.9020324","url":null,"abstract":"Recurrent neural networks (RNNs) along with connectionist temporal classification (CTC) have been widely used in many sequence-to-sequence tasks, including automatic speech recognition (ASR), lipreading, and scene text recognition (STR). In these systems, CTC-trained RNNs usually require specific CTC decoders after their output layers. Many existing CTC-trained RNN inference systems use an FPGA for the RNN calculations and decode the outputs on a CPU. However, with the development of FPGA-based RNN hardware accelerators, existing CPU-based CTC decoders cannot meet their latency requirements. To resolve this issue, this paper proposes an efficient hardware architecture for the CTC beam search decoder based on the decoding method reported in our previous work. The experimental results show that the system latency per sample of the CTC decoder is only 7.19 µs on the Xilinx xc7vx1140tflg19301 FPGA platform, which is lower than that of state-of-the-art RNNs. We also implement the original algorithm on the same FPGA platform. Comparison results show that the improved one reduces the system latency per sample by 63.67%, the LUTRAMs by 97.44%, the FFs by 79.55%, and the DSPs by 50%. To the best of our knowledge, this is the first work on hardware implementation for CTC beam search decoder.","PeriodicalId":256971,"journal":{"name":"2019 IEEE International Workshop on Signal Processing Systems (SiPS)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128939027","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
[Copyright notice]
2019 IEEE International Workshop on Signal Processing Systems (SiPS) Pub Date : 2019-10-01 DOI: 10.1109/sips47522.2019.9020396
{"title":"[Copyright notice]","authors":"","doi":"10.1109/sips47522.2019.9020396","DOIUrl":"https://doi.org/10.1109/sips47522.2019.9020396","url":null,"abstract":"","PeriodicalId":256971,"journal":{"name":"2019 IEEE International Workshop on Signal Processing Systems (SiPS)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131617679","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A New Inversionless Berlekamp-Massey Algorithm with Efficient Architecture
2019 IEEE International Workshop on Signal Processing Systems (SiPS) Pub Date : 2019-10-01 DOI: 10.1109/SiPS47522.2019.9020488
Chao Chen, Y. Han, Zhongfeng Wang, B. Bai
{"title":"A New Inversionless Berlekamp-Massey Algorithm with Efficient Architecture","authors":"Chao Chen, Y. Han, Zhongfeng Wang, B. Bai","doi":"10.1109/SiPS47522.2019.9020488","DOIUrl":"https://doi.org/10.1109/SiPS47522.2019.9020488","url":null,"abstract":"This paper presents a new inversionless Berlekamp-Massey (BM) algorithm as well as its efficient architecture. Starting with a lesser-known version of the BM algorithm, we develop a series of inversionless variants by successively applying algorithmic transformations. The final algorithm has a very compact description and a highly regular structure, which can be naturally mapped to a systolic architecture. Compared with the state-of-the-art architecture RiBM, the proposed one possesses a different cell structure and has slightly lower hardware requirements. More importantly, it enables us to establish a new architectural equivalence between the BM algorithm and the Euclidean algorithm.","PeriodicalId":256971,"journal":{"name":"2019 IEEE International Workshop on Signal Processing Systems (SiPS)","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131682738","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
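For readers unfamiliar with the Berlekamp-Massey algorithm named in the abstract above, here is a minimal sketch of the textbook version over GF(2), which finds the shortest linear feedback shift register (LFSR) generating a bit sequence. It is not the paper's new variant or its systolic architecture. Over GF(2) the only nonzero discrepancy is 1, so no field inversion ever arises, whereas inversionless variants for larger fields have to rearrange the update to avoid it. The function name and the test sequence are illustrative assumptions.

```python
def berlekamp_massey_gf2(bits):
    """Return (L, c): shortest LFSR length and connection polynomial for bits.

    c[0] is always 1, and for every n >= L the sequence satisfies
    bits[n] = c[1]*bits[n-1] ^ ... ^ c[L]*bits[n-L]  (arithmetic in GF(2)).
    """
    n_bits = len(bits)
    c = [1] + [0] * n_bits   # current connection polynomial C(x)
    b = [1] + [0] * n_bits   # C(x) as it was before the last length change
    length = 0               # current LFSR length L
    shift = 1                # steps since the last length change
    for n in range(n_bits):
        # Discrepancy: 0 if the current LFSR already predicts bits[n].
        d = bits[n]
        for i in range(1, length + 1):
            d ^= c[i] & bits[n - i]
        if d == 0:
            shift += 1
            continue
        previous = c[:]
        for i in range(n_bits + 1 - shift):
            c[i + shift] ^= b[i]   # C(x) <- C(x) + x^shift * B(x)
        if 2 * length <= n:
            length = n + 1 - length
            b = previous
            shift = 1
        else:
            shift += 1
    return length, c[:length + 1]


# Hypothetical test sequence from the recurrence s[n] = s[n-3] ^ s[n-4].
seq = [1, 0, 0, 0]
while len(seq) < 16:
    seq.append(seq[-3] ^ seq[-4])
lfsr_length, connection = berlekamp_massey_gf2(seq)
```

The returned `connection` list holds the coefficients of C(x) from degree 0 upward, so the recovered recurrence can be checked directly against the input sequence.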
FPGA Prototyping of A Millimeter-Wave Multiple Gigabit WLAN System
2019 IEEE International Workshop on Signal Processing Systems (SiPS) Pub Date : 2019-10-01 DOI: 10.1109/SiPS47522.2019.9020634
Dongming Ren, Kang Chen, Shengheng Liu, Yongming Huang
{"title":"FPGA Prototyping of A Millimeter-Wave Multiple Gigabit WLAN System","authors":"Dongming Ren, Kang Chen, Shengheng Liu, Yongming Huang","doi":"10.1109/SiPS47522.2019.9020634","DOIUrl":"https://doi.org/10.1109/SiPS47522.2019.9020634","url":null,"abstract":"The IEEE 802.11aj (45-GHz) standard was recently proposed for wireless local area networks operating in an undefined millimeter-wave (mmWave) band. In this work, an ultra-high-speed mmWave orthogonal frequency division multiplexing transmission prototype is developed and some primary amendments in this standard are verified using the NI-PXIe mmWave software-defined-radio platform. A mixed parallel processing scheme is devised to meet the clock requirements of field programmable gate array baseband processing. A queue-based synchronization mechanism is designed to facilitate the implementation of data transport. A data transmission test indicates that the system is able to achieve an extremely high data rate of multiple gigabits per second with a low bit error rate.","PeriodicalId":256971,"journal":{"name":"2019 IEEE International Workshop on Signal Processing Systems (SiPS)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133917498","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1