{"title":"Case study: Functional verification of a reconfigurable systolic array using truss","authors":"Myoung-Keun You, Young-Jin Oh, Gi-Yong Song","doi":"10.1109/ASICON.2009.5351301","DOIUrl":"https://doi.org/10.1109/ASICON.2009.5351301","url":null,"abstract":"This paper introduces our experience in verifying the operation of each systolic array before and after reconfiguration using Truss. Truss is an implementation of an open-source verification infrastructure based on layer approach. Reconfigurable systolic array for solving either single-source shortest path problem or 0–1 knapsack problem is chosen as a reconfigurable device-under-test. One systolic array can be reconfigured into the other and vice versa according to the problem. The functional verification is performed on a reconfigurable device-under-test using Truss configured to this specific hardware.1","PeriodicalId":446584,"journal":{"name":"2009 IEEE 8th International Conference on ASIC","volume":"19 5","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120874028","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Scalable and unified hardware architecture for montgomery inversion computation in GF(p) and GF(2n)","authors":"Yang Xiao-hui, Qin Fan, Dai Zibin, Zhang Yong-fu","doi":"10.1109/ASICON.2009.5351562","DOIUrl":"https://doi.org/10.1109/ASICON.2009.5351562","url":null,"abstract":"Computing the inverse of a number in finite fields GF(p) or GF(2n) is equally important for cryptographic applications. In this paper four optimized Montgomery inverse algorithms are proposed to achieve high speed and flexibility. Then a novel scalable and unified architecture for Montgomery inverse hardware that operates in both GF(p) and GF(2n) is proposed. The scalable design is the novel modification performed on the fixed hardware to make it occupy a small area and operate with better or similar speed, and it takes less number of clock cycle as the datapath of scalable design is large and can also achieve high clock frequency. Finally this work has been verified by modeling it in Verilog-HDL, implementing it under 0.18µm SMIC technology. The result indicates that our work has advanced performance than other works.","PeriodicalId":446584,"journal":{"name":"2009 IEEE 8th International Conference on ASIC","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126156672","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"All digital wireless transceiver using modified BPSK and 2/3 sub-sampling technique","authors":"Sanad Bushnaq, T. Nakura, M. Ikeda, K. Asada","doi":"10.1109/ASICON.2009.5351341","DOIUrl":"https://doi.org/10.1109/ASICON.2009.5351341","url":null,"abstract":"In this paper an all digital wireless transceiver is presented, with a proposed technique for data recovery. Communication is carried out on carrier frequency of 100 MHz with a local clock of 2/3 carrier frequency on the receiver side used to sub-sample incoming wireless signal and understand transmitted data. A modified BPSK is employed, which stretches periods of phase change to enable data recovery in our all digital circuit without clock recovery. The transceiver is implemented and tested on FPGA connected to coils to perform actual short range wireless communication. Our design uses no analog components and our target is to consume as low power as possible, which makes it suitable for low power applications like wireless image sensor nodes.","PeriodicalId":446584,"journal":{"name":"2009 IEEE 8th International Conference on ASIC","volume":"87 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124997566","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automatic parallelization experiments on 16PE NOC based MPSOC","authors":"G. Tian, O. Hammami","doi":"10.1109/ASICON.2009.5351532","DOIUrl":"https://doi.org/10.1109/ASICON.2009.5351532","url":null,"abstract":"Multi-Processors System on Chip (MPSOC) is emerging as solutions for high performance embedded systems. Although important work have been achieved in the design and implementation of such systems the issue of parallel software design have not yet been properly evaluated for these targets. We present in this work automatic parallelization experiment results on a 16PE NOC based MPSOC which we designed and implemented on a single FPGA chip. All reported results come from actual execution and show that speed-up becomes limited beyond 8 processors in this external memory constrained environment.","PeriodicalId":446584,"journal":{"name":"2009 IEEE 8th International Conference on ASIC","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125016111","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Power-aware FPGA packing algorithm","authors":"M. Yang, Hongying Xu, A. Almaini","doi":"10.1109/ASICON.2009.5351571","DOIUrl":"https://doi.org/10.1109/ASICON.2009.5351571","url":null,"abstract":"Field-Programmable Gate Array (FPGA) packing is one of abstraction levels in the FPGA CAD design flow which is aimed to pack logic components into clusters. As a result, the cluster-based FPGA can significantly improve timing, routability and power consumption as well. This paper proposes a novel packing algorithm using priori wire length estimation before actual routing taken apart. In addition, global placement was taken apart before packing to have additional placement information, which is also guided for the algorithm to selectively pack closely related module into one cluster. Experimental results show that power-aware packing algorithm achieves 5% power reduction on average compared to traditional algorithm1.","PeriodicalId":446584,"journal":{"name":"2009 IEEE 8th International Conference on ASIC","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129400381","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An area-efficient and degree-computationless BCH decoder for DVB-S2","authors":"Zhou Chen, Yulong Zhang, Yan Ying, Chuan Wu, Xiaoyang Zeng","doi":"10.1109/ASICON.2009.5351625","DOIUrl":"https://doi.org/10.1109/ASICON.2009.5351625","url":null,"abstract":"This paper presents an area-efficient BCH decoder for DVB-S2 system. The proposed architecture can support all 11 code rates in DVB-S2. Based on the modified Euclidean algorithm (MEA), The BCH decoder has a low hardware complexity with the folding and degree computationless architecture in key equation solver (KES) block. Further more, the multiplier in Galois Field is also optimized to reduce the hardware complexity. The proposed decoder requires at least 16% fewer gates than the conventional RS/BCH decoders and can work up to 277MHz, which meets the speed requirements of the system1.","PeriodicalId":446584,"journal":{"name":"2009 IEEE 8th International Conference on ASIC","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128370094","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A switched Hall IC for automotive electronic applications","authors":"Dongfang Cheng, Zhifei Yi, Weiyong Wang, Xiaohui Li","doi":"10.1109/ASICON.2009.5351534","DOIUrl":"https://doi.org/10.1109/ASICON.2009.5351534","url":null,"abstract":"This paper presents a design of a switched Hall IC, which can operate in automotive electronic applications with wide range of voltage and temperature from 3.8V to 30V and −40°C to +150°C, respectively. Besides, for a hysteretic comparator is contained in the signal processing circuit, it can be used with hysteretic loop-line width of 0.456V and with trigger and release magnetic field of 14.5mT and 11.5mT, respectively. The whole design are simulated and verified by 4um bipolar technology, and the simulation results prove the validity of our design.","PeriodicalId":446584,"journal":{"name":"2009 IEEE 8th International Conference on ASIC","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129549606","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A fully pipelined CABAC coder using syntax element instructions driving","authors":"Shenggang Chen, Shuming Chen, X. Ning","doi":"10.1109/ASICON.2009.5351580","DOIUrl":"https://doi.org/10.1109/ASICON.2009.5351580","url":null,"abstract":"Context-based Adaptive Binary Arithmetic Coder (CABAC) is an essential part in the H.264 main profile video encoder to generate final bitstream. With the development of large-scale parallel H.264 encoder and the high definition video requirement, it increasingly poses a bottleneck in the video encoding path of the parallel encoders. This paper proposes a fully pipelined hardware CABAC coder to speed up the bitstream generation, which is suitable for accelerating a node processor in a manycore chip. The coder employs a CPU-like execution style and using the Syntax Elements Instructions (SEI) to drive the pipeline. Synthesis results with SIMC 0.13um technology show that with an area of 3.21K logic gates, 3.5K RAM bits and 34.375K ROM bits, this design can achieve a high throughput of 590Mbps, basically supporting the real-time HD video coding1.","PeriodicalId":446584,"journal":{"name":"2009 IEEE 8th International Conference on ASIC","volume":"501 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127042760","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Geometry optimization of SiGe HBTs for noise performance of the monolithic Low noise amplifier","authors":"Pei Shen, Wanrong Zhang, Hongyun Xie, D. Jin","doi":"10.1109/ASICON.2009.5351284","DOIUrl":"https://doi.org/10.1109/ASICON.2009.5351284","url":null,"abstract":"The influence of various geometry sizes of SiGe HBTs on noise performance of the monolithic Low noise amplifiers(LNAs) is investigated in this paper. Four types of LNAs using SiGe HBTs with different emitter widths, emitter lengths and emitter strip numbers are fabricated in a 0.35-µm Si BiCMOS process technology. The die areas are only 0.2mm<sup>2</sup> due to the absence of inductors. The noise figure(NF), associated gain(G<inf>A</inf>) and the optimum source resistance(R<inf>s,opt</inf>) of the LNAs are compared. Simplified analytical expressions of NF and R<inf>s,opt</inf> are presented to give additional insight. Geometry scaling data show that the LNA using SiGe HBT with A<inf>E</inf>=4×40×4µm<sup>2</sup> has the minimum NF of 2.7dB, the maximum gain of 26.7dB and the optimum R<inf>s,opt</inf> of nearly 50Ω compared to other devices geometries. These experiment results provide a guide of device geometry optimizing to develop monolithic LNA for lower noise application<sup>1</sup>.","PeriodicalId":446584,"journal":{"name":"2009 IEEE 8th International Conference on ASIC","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130584546","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A novel synchronization scheme for OFDM-based CMMB receivers","authors":"Xiuping Chen, Chuan Wu, Zhou Chen, Bo Shen, Xiaoyang Zeng","doi":"10.1109/ASICON.2009.5351635","DOIUrl":"https://doi.org/10.1109/ASICON.2009.5351635","url":null,"abstract":"A novel synchronization scheme aiming at the special frame structure of China mobile multimedia broadcasting (CMMB) is presented in this paper. Based on the fact that CMMB system is very sensitive to the synchronization error, this scheme gives a solution which integrates high precision symbol timing recovery, carrier frequency recovery and sampling clock frequency recovery. Simulation results show that proposed algorithm can perform robustly at −2dB SNR and TU6 channel with 300Hz Doppler frequency shift. The root mean square error (RMSE) of sampling clock frequency offset estimation and residual carrier frequency offset estimation is 50% less than that of conventional scheme.1","PeriodicalId":446584,"journal":{"name":"2009 IEEE 8th International Conference on ASIC","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130220835","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}