2008 IEEE Workshop on Signal Processing Systems: latest papers (page 5)

Traffic-balanced IP mapping algorithm for 2D-mesh On-Chip-Networks

2008 IEEE Workshop on Signal Processing Systems Pub Date : 2008-11-17 DOI: 10.1109/SIPS.2008.4671762

Ting-Jung Lin, Shu-Yen Lin, A. Wu

Citations: 3

Cooperative OFDM for energy-efficient wireless sensor networks

2008 IEEE Workshop on Signal Processing Systems Pub Date : 2008-11-17 DOI: 10.1109/SIPS.2008.4671741

Weiguo Tang, Lei Wang

Citations: 17

Minimal complexity low-latency architectures for Viterbi decoders

2008 IEEE Workshop on Signal Processing Systems Pub Date : 2008-11-17 DOI: 10.1109/SIPS.2008.4671752

Renfei Liu, K. Parhi

Abstract: For Viterbi decoders, a high throughput rate is achieved by applying look-ahead techniques in the add-compare-select unit, which is the system speed bottleneck. Look-ahead techniques combine multiple binary trellis steps into one equivalent complex trellis step in time sequence, which is referred to as the branch metrics precomputation (BMP) unit. The complexity and latency of BMP increase exponentially and linearly with respect to the look-ahead levels, respectively. For a Viterbi decoder with constraint length K and M-step look-ahead, $2^{M+K-1}$ branch metrics need to be computed and compared. In this paper, the computational redundancy in existing branch metric computation approaches is first recognized, and a general mathematical model for describing the approach space is built, based on which a new approach with minimal complexity and latency is proposed. The proof of its optimality is also given. This highly efficient approach leads to a novel overall optimal architecture for M that is any multiple of K. The results show that the proposed approaches can reduce the complexity by up to 45.65% and the latency by up to 72.50%. In addition, the proposed architecture can also be applied when M is any value while achieving the minimal complexity.
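The branch metric count quoted in the abstract can be checked by brute force: a trellis for constraint length K has $2^{K-1}$ states, and each state has $2^M$ outgoing M-step paths. The following is a minimal sketch, assuming a shift-register state update in which the new input bit enters at the least significant bit; the function names are illustrative and do not come from the paper, and the code does not implement the paper's reduced-complexity BMP approach.

```python
from itertools import product


def lookahead_paths(K, M):
    """Enumerate every M-step trellis path as (start_state, input_bits, end_state).

    Assumption: the encoder state is the last K-1 input bits, updated as a
    shift register with the newest bit in the least significant position.
    """
    n_state_bits = K - 1
    mask = (1 << n_state_bits) - 1
    paths = []
    for state in range(1 << n_state_bits):
        for bits in product((0, 1), repeat=M):
            s = state
            for b in bits:
                s = ((s << 1) | b) & mask  # one binary trellis step
            paths.append((state, bits, s))
    return paths


def branch_metric_count(K, M):
    """Closed-form count of M-step branch metrics, 2^(M+K-1)."""
    return 2 ** (M + K - 1)


if __name__ == "__main__":
    for K, M in [(3, 1), (3, 2), (3, 3), (5, 4)]:
        print(K, M, len(lookahead_paths(K, M)), branch_metric_count(K, M))
```

Each additional look-ahead level doubles the number of paths, which is the exponential growth in BMP complexity that the abstract refers to.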

Citations: 5

Efficient interpolation architecture for soft-decision Reed-Solomon decoding by applying slow-down

2008 IEEE Workshop on Signal Processing Systems Pub Date : 2008-11-17 DOI: 10.1109/SIPS.2008.4671731

Xinmiao Zhang, Jiangli Zhu

Citations: 7

An implementation friendly low complexity multiplierless LLR generator for soft MIMO sphere decoders

2008 IEEE Workshop on Signal Processing Systems Pub Date : 2008-11-17 DOI: 10.1109/SIPS.2008.4671748

Min Li, D. Novo, B. Bougard, F. Naessens, L. Perre, F. Catthoor

Abstract: When combined with advanced FEC techniques such as the turbo code and LDPC code, soft-output MIMO sphere decoders significantly outperform hard-output sphere decoders. Hence, algorithms and implementations of soft-output sphere decoders have attracted intensive interest in recent years. Practical soft-output sphere decoder implementations often consist of a list generator and an LLR generator. Most existing implementations focus on the list generator, and the LLR generator is implemented in a relatively straightforward way. However, the LLR generator accounts for a great part of the complexity. Our contribution is an implementation friendly low complexity multiplierless LLR generator. We apply selective and incremental updating, algebraic simplifications and strength reductions to reduce the algorithmic complexity and to eliminate all multiplications. When integrated with the SSFE list generator, our scheme not only removes 100% of the multiplications, but also removes 26% to 83% of the additions, 76% to 94% of the bit-shifts and 63% to 91% of the memory operations. Besides the algorithmic aspects, we extract the key data-flow block with well-defined control signals. This can be easily mapped onto micro-architectures and implemented as the data-path in ASICs, or a function unit in ASIPs.
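To make the list generator / LLR generator split concrete, here is a minimal sketch of the straightforward baseline that such work starts from: a max-log LLR computed from a candidate list. It is not the paper's selective and incremental scheme. The candidate format (a bit tuple plus a distance), the sign convention (positive favours bit 1) and the `clip` value are all assumptions made for illustration.

```python
def max_log_llrs(candidates, n_bits, clip=8.0):
    """Max-log LLRs from a list of (bits, distance) candidates.

    For each bit position, take the smallest distance among candidates with
    that bit at 0 and at 1; the LLR is their difference. Only comparisons and
    subtractions are used in this step.
    """
    min0 = [None] * n_bits  # best distance seen with bit i == 0
    min1 = [None] * n_bits  # best distance seen with bit i == 1
    for bits, dist in candidates:
        for i in range(n_bits):
            table = min1 if bits[i] else min0
            if table[i] is None or dist < table[i]:
                table[i] = dist

    llrs = []
    for i in range(n_bits):
        if min0[i] is None:        # no counter-hypothesis with bit 0 in the list
            llrs.append(clip)
        elif min1[i] is None:      # no counter-hypothesis with bit 1 in the list
            llrs.append(-clip)
        else:
            llrs.append(max(-clip, min(clip, min0[i] - min1[i])))
    return llrs


if __name__ == "__main__":
    cands = [((0, 1), 1.0), ((1, 1), 3.0), ((0, 0), 2.5)]
    print(max_log_llrs(cands, n_bits=2))
```

Every candidate updates every bit position in this baseline, which is the kind of redundant work that selective updating would aim to avoid.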

Citations: 10

Kalman filtering based motion estimation for video coding with adaptive block partitioning

2008 IEEE Workshop on Signal Processing Systems Pub Date : 2008-11-17 DOI: 10.1109/SIPS.2008.4671750

Yi-Shiou Luo, M. Celenk

Citations: 5

Fast multiple reference frame selection methods for H.264/AVC

2008 IEEE Workshop on Signal Processing Systems Pub Date : 2008-11-17 DOI: 10.1109/SIPS.2008.4671751

Shin Wang Ho, S. D. Kim, M. Sunwoo

Citations: 3

A unified instruction set programmable architecture for multi-standard advanced forward error correction

2008 IEEE Workshop on Signal Processing Systems Pub Date : 2008-11-17 DOI: 10.1109/SIPS.2008.4671733

F. Naessens, B. Bougard, Siebert Bressinck, L. Hollevoet, P. Raghavan, L. Perre, F. Catthoor

Abstract: The continuously increasing number of communication standards to be supported in nomadic devices, combined with the fast-ramping design cost in deep submicron technologies, calls for highly reusable and flexible programmable solutions. Software defined radio (SDR) aims at providing such solutions in radio baseband architectures. Great advances were recently made in handset-targeted SDR, covering most of the baseband processing with satisfactory performance and energy efficiency. However, as it typically entails an order-of-magnitude higher computation load, forward error correction (FEC) has been excluded from the scope of high throughput SDR solutions and left to dedicated hardware accelerators. The currently growing number of advanced FEC options, however, calls for flexibility there too. This paper presents the first application-specific instruction programmable architecture addressing in a unified way the emerging turbo and LDPC coding requirements of 3GPP-LTE, IEEE 802.11n, IEEE 802.16(e) and DVB-S2/T2. The proposal shows a throughput from 0.07 to 1.25 Mbps/MHz with efficiencies around 0.32 nJ/bit/iter in turbo mode and around 0.085 nJ/bit/iter in LDPC mode. The area is lower than the cumulated area of dedicated turbo and LDPC solutions.

Citations: 47

Application-driven adaptive fixed-point refinement for SDRs

2008 IEEE Workshop on Signal Processing Systems Pub Date : 2008-11-17 DOI: 10.1109/SIPS.2008.4671770

D. Novo, Min Li, B. Bougard, F. Naessens, L. Perre, F. Catthoor

Abstract: Wireless interfaces implement an increasing number of different standards. For cost effectiveness, flexible radio implementations are preferred over the multiplication of dedicated solutions. Software Defined Radios (SDR) have been introduced as the ultimate way to achieve such flexibility. However, the reduced energy budget required by battery-powered solutions makes the typical worst-case static dimensioning unaffordable under highly dynamic operating conditions. Instead, energy-scalable algorithms and implementations are needed to provide flexibility while maintaining the required energy efficiency. In particular, energy-scalable implementations can exploit data-format properties to offer different tradeoffs between accuracy and energy. In this paper, an application-driven adaptive fixed-point refinement methodology is proposed. It derives the minimum word-lengths that respect a user-defined degradation of the application performance. This technique is applied to the fixed-point refinement of a Near-ML MIMO (Multiple Inputs, Multiple Outputs) detector. Variations in the minimum required precision depending on external conditions are made explicit. Finally, on a processor platform these variations can be translated into reduced cycles and energy by leveraging sub-word parallel implementations.
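The idea of deriving the minimum word-length under a user-defined degradation budget can be sketched as a simple search. This is only an illustration under stated assumptions: the degradation measure here is the maximum absolute quantization error on a few samples, standing in for an application-level metric such as bit error rate, and `quantize`, `max_abs_error` and `min_word_length` are hypothetical names, not the paper's methodology.

```python
def quantize(x, word_len, int_bits):
    """Round x to signed fixed point: word_len total bits, of which int_bits
    (sign included) lie left of the binary point. Saturates on overflow."""
    frac_bits = word_len - int_bits
    scale = 1 << frac_bits
    lo, hi = -(1 << (word_len - 1)), (1 << (word_len - 1)) - 1
    q = max(lo, min(hi, round(x * scale)))
    return q / scale


def max_abs_error(reference, quantized):
    """Toy degradation measure: worst-case absolute error."""
    return max(abs(r - q) for r, q in zip(reference, quantized))


def min_word_length(samples, degradation, budget, int_bits=2, max_len=16):
    """Smallest word length whose degradation stays within budget, else None."""
    for word_len in range(int_bits, max_len + 1):
        quantized = [quantize(x, word_len, int_bits) for x in samples]
        if degradation(samples, quantized) <= budget:
            return word_len
    return None


if __name__ == "__main__":
    data = [0.1, -0.7, 1.3, 0.55]
    for budget in (0.2, 0.06, 0.01):
        print(budget, min_word_length(data, max_abs_error, budget))
```

Rerunning the search with a different budget or different input statistics yields a different minimum word-length, which is the sense in which the required precision adapts to operating conditions.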

Citations: 0

Unified decoder architecture for LDPC/turbo codes

2008 IEEE Workshop on Signal Processing Systems Pub Date : 2008-11-17 DOI: 10.1109/SIPS.2008.4671730

Yang Sun, Joseph R. Cavallaro

Abstract: Low-density parity-check (LDPC) codes and convolutional turbo codes (CTC) are two of the most powerful error correction codes, known to perform very close to the Shannon limit. However, their different code structures usually lead to different hardware implementations. In this paper, we propose a unified decoder architecture that is capable of decoding both LDPC and turbo codes with a limited hardware overhead. We employ the maximum a posteriori (MAP) algorithm as a bridge between LDPC and turbo codes. We represent LDPC codes as parallel concatenated single parity check (PCSPC) codes and propose a group sub-trellis (GST) decoding algorithm for the efficient decoding of PCSPC codes. This algorithm achieves about 2X improvement in the convergence speed and is more numerically robust than the classical "tanh" algorithm. What is more interesting is that we can generalize a unified trellis decoding algorithm for LDPC and turbo codes based on their trellis structures. We propose a reconfigurable computation kernel for log-MAP decoding of LDPC and turbo codes at a cost of ~15% hardware overhead. Small lookup tables (LUTs) with 9 entries of 2-bit data are designed to implement the log-MAP algorithm. Fixed point (6:2) simulation results show that there is negligible or nearly no performance loss by using this LUT approximation compared to the ideal case. The proposed architecture results in scalable and flexible datapath units enabling parallel decoding of LDPC/turbo codes.
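The LUT-based log-MAP kernel mentioned in the abstract rests on the identity max*(a, b) = max(a, b) + ln(1 + e^{-|a-b|}), where the correction term is read from a small table. The sketch below is a guess at what such a table could look like, not the paper's actual table: it assumes values are integers in units of 1/4 (reading "(6:2)" as two fractional bits) and samples the correction at |a-b| = 0, 0.25, ..., 2.0, which gives 9 entries that each fit in 2 bits.

```python
import math

FRAC_BITS = 2             # assumption: two fractional bits
STEP = 1 << FRAC_BITS     # integer units per 1.0

# Hypothetical 9-entry table of ln(1 + e^-d), d = 0, 0.25, ..., 2.0,
# rounded to the nearest multiple of 1/4 and stored as a 2-bit integer.
LUT = [round(math.log1p(math.exp(-i / STEP)) * STEP) for i in range(9)]


def max_star(a, b):
    """Approximate log(e^a + e^b) for fixed-point integers a, b (units of 1/4)."""
    d = abs(a - b)
    corr = LUT[d] if d < len(LUT) else 0  # correction treated as 0 beyond the table
    return max(a, b) + corr


def max_star_exact(a, b):
    """Floating-point reference in the same units, for comparison."""
    x, y = a / STEP, b / STEP
    return (max(x, y) + math.log1p(math.exp(-abs(x - y)))) * STEP


if __name__ == "__main__":
    print(LUT)
    for a, b in [(8, 8), (8, 6), (8, 0), (20, 0)]:
        print(a, b, max_star(a, b), round(max_star_exact(a, b), 3))
```

Comparing `max_star` with `max_star_exact` on a few inputs gives a feel for how little the coarse table changes the result, though the paper's own performance claim rests on its simulations, not on this sketch.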

Citations: 25