{"title":"Automatic mapping of application to coarse-grained reconfigurable architecture based on high-level synthesis techniques","authors":"Ganghee Lee, Seokhyun Lee, Kiyoung Choi","doi":"10.1109/SOCDC.2008.4815655","DOIUrl":"https://doi.org/10.1109/SOCDC.2008.4815655","url":null,"abstract":"Coarse-grained reconfigurable architecture is good for both performance and flexibility. However, it is not easy to map applications to such architecture since it requires compilation of the application and configuration of the architecture at the same time while trying to maximally exploit the parallelism in the application and the architecture. In this paper, we introduce an approach to mapping applications to coarse-grained reconfigurable architecture based on high-level synthesis techniques. We adopt performance enhancing techniques including loop unrolling and loop pipelining for temporal mapping on a reconfigurable array architecture. Experimental results with DSPstone benchmark examples show the effectiveness of the proposed approach.","PeriodicalId":405078,"journal":{"name":"2008 International SoC Design Conference","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134096176","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The use of Fair Y-Sim for optimizing mapping set selection in hardware/software co-design","authors":"O. Adeluyi, Eun-ok Kim, J.A. Lee, Jeong-Gun Lee","doi":"10.1109/SOCDC.2008.4815712","DOIUrl":"https://doi.org/10.1109/SOCDC.2008.4815712","url":null,"abstract":"This paper proposes a new hardware/software partitioning and mapping procedure based on a Y-chart design approach for the partitioning of stream based real-time video signal processing algorithms. The approach of this paper is to ensure “fairness” in the hardware-software partitioning by increasing the capacity of application functions to become candidates for mapping cases through the iterative equalization of the execution times by sub-partitioning the functions with excessively long execution times. Then, a simulation tool called Fair Y-Sim (fairy-sim) is developed to streamline the mapping set to the best cases based on some pre-specified metrics. Our experimental results show that when this is done in tandem with the Heuristic Algorithm for Reducing Mapping Sets (HARMS) we can obtain a mapping set streamlining ratio of up to 4.83% of the best mapping cases, while eliminating 95.17% of the initial mapping set based on their throughput values.","PeriodicalId":405078,"journal":{"name":"2008 International SoC Design Conference","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134322961","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Simplified NAL decoder for H.264/AVC baseline profile","authors":"Kwangrae Jeong, Jinha Choi, Jaeseok Kim","doi":"10.1109/SOCDC.2008.4815698","DOIUrl":"https://doi.org/10.1109/SOCDC.2008.4815698","url":null,"abstract":"In this paper, we propose the efficient NAL (Network Abstraction Layer) decoder for H.264/AVC baseline profile which is specified for mobile telecommunication. Therefore, we can simplify NAL decoder for baseline profile. Many parameters and complex algorithms are fixed or removed in baseline profile. This optimization approach can be defined by target based NAL decoder implementation. We discuss about this target based NAL decoder and implementation especially in H.264/AVC baseline profile.","PeriodicalId":405078,"journal":{"name":"2008 International SoC Design Conference","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133815426","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Image quality enhancement by real-time gamma correction for a CMOS image sensor in mobile phones","authors":"H. Jeong, Joohyun Kim, Wontae Choi, B. Kang","doi":"10.1109/SOCDC.2008.4815699","DOIUrl":"https://doi.org/10.1109/SOCDC.2008.4815699","url":null,"abstract":"Real-time gamma correction is an essential function in display devices such as CRTs, plasma TVs, and TFT LCDs. In this paper, we present the image quality enhancement by real-time gamma correction for a CMOS image sensor in mobile phones. The proposed algorithm and system operate with a block structure and the bit width of input and output of the proposed system is 12-bit. The proposed gamma system is reduced the error range and enhanced the image quality compared with conventional system that has 10-bit input and 8-bit output. The proposed system was implemented experimentally by using Xilinx Virtex4 FPGA and was successfully demonstrated with a CMOS image sensor for mobile application.","PeriodicalId":405078,"journal":{"name":"2008 International SoC Design Conference","volume":"88 4","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114021744","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Investigation of forward body bias effects on TSPC RF frequency dividers in 0.18 μm CMOS","authors":"Seungsoo Kim, Hyunchol Shin","doi":"10.1109/SOCDC.2008.4815659","DOIUrl":"https://doi.org/10.1109/SOCDC.2008.4815659","url":null,"abstract":"Effects of forward body biasing (FBB) is investigated as an effective mean of on-chip scaling of power consumption and operating speed in CMOS true single phase clock (TSPC) RF frequency divide-by-2 circuits. Through extensive dc and RF simulations in 0.18 μm CMOS, the effects of the forward body bias on the threshold voltage, propagation delay, and current dissipation are examined. Then, it is shown that only with the FBB voltage of 0.2 V, the divide-by-2 circuits achieves 22% and 21% improvements in the maximum operating speed while only at the cost of 15% and 32% more current dissipation for TSPC and extended TSPC (E-TSPC) type logics, respectively. We believe that the forward body biasing technique is instrumental in realizing on-chip on-the-fly scalable TSPC dividers for low power applications.","PeriodicalId":405078,"journal":{"name":"2008 International SoC Design Conference","volume":"13 37","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114044467","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hardware implementation of motion estimation using a sub-sampled block for frame rate up-conversion","authors":"Suk-ju Kang, D. Yoo, Sung-Kyu Lee, Young Hwan Kim","doi":"10.1109/SOCDC.2008.4815694","DOIUrl":"https://doi.org/10.1109/SOCDC.2008.4815694","url":null,"abstract":"In this paper, we present a new motion estimation hardware architecture using a sub-sampled block, which can be used for frame rate up-conversion. The proposed architecture provides the advantage of reducing computational hardware complexity greatly, compared to the conventional architecture, while maintaining the quality of interpolated images. FPGA implementation shows that the proposed motion estimation hardware architecture reduces the hardware size by 51%, compared to the conventional architecture at the cost of average PSNR degradation of only 0.22 dB for interpolated images.","PeriodicalId":405078,"journal":{"name":"2008 International SoC Design Conference","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128722672","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A 5-Gb/s continuous-time adaptive equalizer and CDR using 0.18μm CMOS","authors":"Tae-Ho Kim, Sang-Ho Kim, Jin-Ku Kang","doi":"10.1109/SOCDC.2008.4815681","DOIUrl":"https://doi.org/10.1109/SOCDC.2008.4815681","url":null,"abstract":"In this paper, a 5-Gb/s receiver with adaptive equalizer and clock and data recovery(CDR) for serial link interface is proposed. In order to operate adaptively at 5-Gb/s data rate, LMS algorithm uses two internal signals from slicers which does not have an effect on gain boosting performance. In addition, this scheme enables it to operate without passive filter since two internal signals of slicers has a similar DC magnitude. The proposed adaptive equalizer in this receiver can compensate up to 20-dB and operate in various environments, which are 15-m shield twisted pair(STP) cable for DisplayPort and flame retardant 4(FR-4) traces up to 60-inch adaptively. This work is implemented 0.18-μm 1-poly 4-metal CMOS technology. Power dissipation of the equalizer is only 6-mW and it occupies 200 μm × 350 μm. Total power dissipation of the combined CDR is 164-mW (including output buffers) and operating range is available up to 5-Gb/s.","PeriodicalId":405078,"journal":{"name":"2008 International SoC Design Conference","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116309633","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Behavioral partitioning with exploiting function-level parallelism","authors":"Y. Hara, H. Tomiyama, S. Honda, H. Takada, K. Ishii","doi":"10.1109/SOCDC.2008.4815588","DOIUrl":"https://doi.org/10.1109/SOCDC.2008.4815588","url":null,"abstract":"This paper proposes a method to efficiently generate hardware from a large behavioral description by behavioral synthesis. For a program with functions which are executable in parallel, this proposed method determines an optimal behavioral partitioning which fully exploits the function-level parallelism with simultaneously minimizing the area in the datapath and control path. This partitioning problem is formulated as an integer programming problem. Experimental results demonstrate the effectiveness of our proposed method.","PeriodicalId":405078,"journal":{"name":"2008 International SoC Design Conference","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115628842","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-codec variable length decoder design with configurable processor","authors":"HyoukJoong Lee, Kiyoung Choi","doi":"10.1109/SOCDC.2008.4815594","DOIUrl":"https://doi.org/10.1109/SOCDC.2008.4815594","url":null,"abstract":"Multi-codec video decoder is widely used with increasing number of video standards. Although this trend requires flexible system design that can accommodate various standards, most Variable Length Decoders (VLDs) for video applications have been designed with the ASIC approach because of the poor performance of software implementation on a processor. This paper presents a design concept for a flexible VLD using configurable processor with additional custom instructions for acceleration. The simulation result shows that the proposed approach improves the performance by 4.68 ~ 5.59 times compared to that of a general purpose processor, enabling MPEG-4 SD video sequence to be decoded in real time on the processor. Our design is flexible in that any VLD process for various video standards can be executed on it without hardware modification.","PeriodicalId":405078,"journal":{"name":"2008 International SoC Design Conference","volume":"54 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115239560","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"High speed 3D acquisition chip design for robot applications","authors":"Byung-Joo Hong, Seung-Hoon Lee, Jun-Dong Cho, Je-Hyuk Ryu","doi":"10.1109/SOCDC.2008.4815735","DOIUrl":"https://doi.org/10.1109/SOCDC.2008.4815735","url":null,"abstract":"Obtaining automatic 3D profile of object is one of the most important issues in a robot vision system. We adopt one of the signal separation coding methods called hierarchically orthogonal code (HOC) based on structured light in order to obtain robust depth imaging. To realize this algorithm, high-speed image processing is essential. Because this algorithm requires 17 raw-data pictures to get a picture containing depth information. Therefore, this paper introduces a high-speed hardware platform to perform 3D modeling. Firstly, we implement the platform using FPGA to verify the functionality of our design. Then, our design is fabricated using Samsung 0.18 um CMOS technology. For the chip test, FPGA-based testing board was connected with components for image sensing. The results show that it requires 58 ms to generate one 3D image in realtime. This processing time is 14.5 times faster than the same implementation using software.","PeriodicalId":405078,"journal":{"name":"2008 International SoC Design Conference","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116099690","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}