{"title":"A Hardware/Software Co-reconfigurable Multimedia Architecture","authors":"Yong-Kyu Jung","doi":"10.1109/ESTMED.2006.321277","DOIUrl":"https://doi.org/10.1109/ESTMED.2006.321277","url":null,"abstract":"The hardware/software co-reconfiguration technique is introduced to design a reconfigurable multimedia architecture that does not employ field-programmable devices. This co-reconfiguration technique does not require modifying existing compilers to retarget their new multimedia processors. This technique allows software developers to rapidly retarget their multimedia processors. In order to present the reconfiguration procedures and performance evaluations of the technique, a smart instruction decoder for Texas Instruments OMAP2420 was implemented and optimized","PeriodicalId":266183,"journal":{"name":"2006 IEEE/ACM/IFIP Workshop on Embedded Systems for Real Time Multimedia","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124014575","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Resource Manager for Non-preemptive Heterogeneous Multiprocessor System-on-chip","authors":"Akash Kumar, B. Mesman, B. Theelen, H. Corporaal, Yajun Ha","doi":"10.1109/ESTMED.2006.321271","DOIUrl":"https://doi.org/10.1109/ESTMED.2006.321271","url":null,"abstract":"Increasingly more MPSoC platforms are being developed to meet the rising demands from concurrently executing applications. These systems are often heterogeneous with the use of dedicated IP blocks and application domain specific processors. While there is a host of research done to provide good performance guarantees and to analyze applications for preemptive uniprocessor systems, the field of heterogeneous, non-preemptive MPSoCs is a mostly unexplored territory. In this paper, we propose to use a resource manager (RM) to improve the resource utilization of these systems. The basic functionalities of such a component are introduced. A high-level simulation model of such a system is developed to study the performance of RM, and a case study is performed for a system running an H.263 and a JPEG decoder. The case study illustrates at what control granularity a resource manager can effectively regulate the progress of applications such that they meet their performance requirements","PeriodicalId":266183,"journal":{"name":"2006 IEEE/ACM/IFIP Workshop on Embedded Systems for Real Time Multimedia","volume":"128 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132597203","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Use of a Bit-true Data Flow Analysis for Processor-Specific Source Code Optimization","authors":"H. Falk, J. Wagner, André Schaefer","doi":"10.1109/ESTMED.2006.321286","DOIUrl":"https://doi.org/10.1109/ESTMED.2006.321286","url":null,"abstract":"Nowadays, key characteristics of a processor's instruction set are only exploited in high-level languages by using inline assembly or compiler intrinsics. Inserting intrinsics into the source code is up to the programmer, since only few automatic approaches exist. Additionally, these approaches base on simple code pattern matching strategies. This paper presents techniques for processor-specific code analysis and optimization at the source-level. It is shown how a bit-true dataflow analysis is made applicable for source code analysis for the TI C6x DSPs for the very first time. Based on this bit-true analysis, fully automated optimizations superior to conventional pattern matching techniques are presented which optimize saturated arithmetic, reduce bitwidths of variables and exploit SIMD data processing within source codes. The application of our implemented algorithms to complex real-life codes leads to speed-ups between 33%-48% for the optimization of saturated arithmetic, and up to 16% after SIMD optimization","PeriodicalId":266183,"journal":{"name":"2006 IEEE/ACM/IFIP Workshop on Embedded Systems for Real Time Multimedia","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133248881","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Energy Characterization Framework for Software-Based Embedded Systems","authors":"Donghoon Lee, T. Ishihara, Masanori Muroyama, H. Yasuura, F. Fallah","doi":"10.1109/ESTMED.2006.321275","DOIUrl":"https://doi.org/10.1109/ESTMED.2006.321275","url":null,"abstract":"This paper proposes an energy characterization framework which helps designers in developing a fast and accurate energy model for a target processor-based system. We use a linear model for energy estimation and we find the coefficients of the model using linear programming (LP). We use our approach for estimating the energy consumption of two commercial microprocessors with their on-chip caches and an off-chip SDRAM. Experimental results demonstrate that the error of our technique is on an average 3% and worst case 16% compared to the gate-level estimation results. Once the model has been developed, the energy consumption of an application program can be estimated with the speed of 300,000 instructions per second","PeriodicalId":266183,"journal":{"name":"2006 IEEE/ACM/IFIP Workshop on Embedded Systems for Real Time Multimedia","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129356415","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"HW/SW Co-Design and Implementation of Multi-Standard Video Decoding","authors":"Liu Feng, Guo Rui, Shi Shu, Cheng Xu","doi":"10.1109/ESTMED.2006.321279","DOIUrl":"https://doi.org/10.1109/ESTMED.2006.321279","url":null,"abstract":"In this paper, we present a design and implementation of multi-standard video decoder, which adopts the principle of HW/SW cooperation to achieve real time video decoding process. Based on the profiling of MPEG-1/2/4 video decoding algorithms, the computational intensive IDCT and sub-pixel interpolation are figured out to implement with hardware, and the dedicated DMA channels are provided to fulfil the high throughput of MC processing. The remained decoding functions are realized with software based on a RISC CPU. The design shares the advantage of high flexibility to fulfil multi-standard processing. With the assistant hardware accelerating, the proposed video decoder can achieve the MPEG-1/2/4 D1 size (720×480) video decoding at 30 fps","PeriodicalId":266183,"journal":{"name":"2006 IEEE/ACM/IFIP Workshop on Embedded Systems for Real Time Multimedia","volume":"os-46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127785735","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Mixed-level Co-simulation Method for System-level Design Space Exploration","authors":"M. Thompson, A. Pimentel, Simon Polstra, Cagkan Erbas","doi":"10.1109/ESTMED.2006.321270","DOIUrl":"https://doi.org/10.1109/ESTMED.2006.321270","url":null,"abstract":"The Sesame modeling and simulation framework aims at efficient system-level design space exploration of embedded multimedia systems. A primary objective of Sesame is the exploration at multiple levels of abstraction. As such, it targets gradual refinement of its (initially abstract) architecture performance models while maintaining architecture-independent application specifications. In this paper, we present a mixed-level co-simulation method, called trace calibration, for incorporating external simulators into Sesame's abstract system-level performance models. We show that trace calibration only requires minor modification of the incorporated simulators and that performance overheads due to co-simulation are minimal. Also, we show that trace calibration transparently supports distributed co-simulation, allowing for effectively reducing the system-level simulation slowdown due to the incorporation of lower-level simulators","PeriodicalId":266183,"journal":{"name":"2006 IEEE/ACM/IFIP Workshop on Embedded Systems for Real Time Multimedia","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130053469","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hardware/Software Partitioned Implementation of Real-time Object-oriented Camera for Arbitrary-shaped MPEG-4 Contents","authors":"Jisu Kim, Hyuk-Jae Lee, Tae-Ho Lee, Myungje Cho, Jae-Beom Lee","doi":"10.1109/ESTMED.2006.321267","DOIUrl":"https://doi.org/10.1109/ESTMED.2006.321267","url":null,"abstract":"Recently developed MPEG-4 part 2 compression standard provides a novel capability to handle arbitrary video objects. To support this capability, an efficient object segmentation technique is required. This paper proposes a real-time algorithm for foreground object segmentation in video sequences. The proposed algorithm consists of two steps: the first step that segments a frame into several sub-regions using spatio-temporal watershed transform and the second one that extracts a foreground object segment from the sub-regions generated in the first step. For real-time processing, the algorithm is partitioned into hardware and software parts so that computationally expensive parts are off-loaded from a processor and executed by hardware accelerators. Simulation results show that the proposed implementation can handle QCIF-size video at 15 fps to extract an accurate foreground object","PeriodicalId":266183,"journal":{"name":"2006 IEEE/ACM/IFIP Workshop on Embedded Systems for Real Time Multimedia","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129409476","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"FPGA accelerator for real-time skin segmentation","authors":"Bart de Ruijsscher, G. Gaydadjiev, J. Lichtenauer, E. Hendriks","doi":"10.1109/ESTMED.2006.321280","DOIUrl":"https://doi.org/10.1109/ESTMED.2006.321280","url":null,"abstract":"Many real-time image processing applications are confronted with performance limitations when implemented in software. The skin segmentation algorithm utilized in hand gesture recognition as developed by the ICT department of Delft University of Technology presents an example of such an application. This paper presents the design of an FPGA based accelerator which alleviates the host PC's computational effort required for real-time skin segmentation. We show that our design utilizes no more than 88% of the resources available within the targeted XC2VP30 device. In addition, the proposed approach is highly portable and not limited to the considered real-time image processing algorithm only","PeriodicalId":266183,"journal":{"name":"2006 IEEE/ACM/IFIP Workshop on Embedded Systems for Real Time Multimedia","volume":"2 7","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131747351","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Neighbors-on-Path: A New Selection Strategy for On-Chip Networks","authors":"G. Ascia, V. Catania, M. Palesi, Davide Patti","doi":"10.1109/ESTMED.2006.321278","DOIUrl":"https://doi.org/10.1109/ESTMED.2006.321278","url":null,"abstract":"Efficient and deadlock-free routing is critical to the performance of networks-on-chip. In this paper we present an approach that can be coupled to any adaptive routing algorithm to improve the performance with a minimal overhead on area and energy consumption. The proposed approach introduces the concept of neighbors-on-path to exploit the situations of indecision occurring when the routing function returns several admissible output channels. A selection strategy is developed with the aim to choose the channel that will allow the packet to be routed to its destination along a path that is as free as possible of congested nodes. Performance evaluation is carried out by using a flit-accurate simulator on traffic scenarios generated by both synthetic and real applications. Results obtained show how the proposed selection policy applied to the odd-even routing algorithm outperforms other deterministic and adaptive routing algorithms both in average delay and energy consumption","PeriodicalId":266183,"journal":{"name":"2006 IEEE/ACM/IFIP Workshop on Embedded Systems for Real Time Multimedia","volume":"247 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124715848","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Loop Nest Splitting for WCET-Optimization and Predictability Improvement","authors":"H. Falk, M. Schwarzer","doi":"10.1109/ESTMED.2006.321283","DOIUrl":"https://doi.org/10.1109/ESTMED.2006.321283","url":null,"abstract":"This paper presents the effect of the loop nest splitting source code optimization on worst-case execution time (WCET). Loop nest splitting minimizes the number of executed if-statements in loop nests of multimedia applications. It identifies iterations where all if-statements are satisfied and splits the loop nest such that if-statements are not executed at all for large parts of the loop nest's iteration space. Especially loops and if-statements are an inherent source of unpredictability and loss of precision for WCET analysis. This is caused by the difficulty to obtain safe and tight worst-case estimates of an application's high-level control flow. In addition, assembly-level control flow redirections reduce predictability even more due to complex processor pipelines and branch prediction units. Loop nest splitting bases on precise mathematical models combined with genetic algorithms. On the one hand, these techniques achieve a significantly more homogeneous control flow structure. On the other hand, the precision of our analyses enables to generate very accurate high-level flow facts for loops and if-statements. The application of our implemented algorithms to three real-life benchmarks leads to average speed-ups by 25.0%-30.1%, while WCET is reduced by 34.0%-36.3%","PeriodicalId":266183,"journal":{"name":"2006 IEEE/ACM/IFIP Workshop on Embedded Systems for Real Time Multimedia","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121675064","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}