International Conference on Hardware/Software Codesign and System Synthesis: latest papers (page 5)

Building heterogeneous reconfigurable systems with a hardware microkernel

International Conference on Hardware/Software Codesign and System Synthesis Pub Date : 2009-10-11 DOI: 10.1145/1629435.1629489

J. Agron, D. Andrews

Abstract: Field Programmable Gate Arrays (FPGAs) have long held the promise of allowing designers to create systems with performance levels close to custom circuits but with software-like productivity for reconfiguring the gates. Unfortunately, achieving this promise has been elusive. Modern platform FPGAs are now large enough to support complete heterogeneous Multiprocessor System-On-Chips (MPSoCs); however, standardized design flows and programming models for such platforms do not yet exist. To achieve true software-like levels of productivity, the design flow and development environment for heterogeneous MPSoCs must resemble that of standard homogeneous systems. In this paper we present a new design flow and run-time system that enables developers to program a heterogeneous MPSoC using standard POSIX-compatible programming abstractions. The ability to use a standard programming model is achieved by using a hardware-based microkernel to provide OS services to all heterogeneous components. This approach makes programming heterogeneous MPSoCs transparent, and can increase programmer productivity by replacing synthesis of custom components with faster compilation of heterogeneous executables. The use of a hardware microkernel provides OS services in an ISA-neutral manner, which allows for seamless synchronization and communication amongst heterogeneous threads.

Citations: 19

SARA: StreAm register allocation

International Conference on Hardware/Software Codesign and System Synthesis Pub Date : 2009-10-11 DOI: 10.1145/1629435.1629442

P. Raghavan, F. Catthoor

Citations: 0

Stack oriented data cache filtering

International Conference on Hardware/Software Codesign and System Synthesis Pub Date : 2009-10-11 DOI: 10.1145/1629435.1629472

Rodrígo González-Alberquilla, Fernando Castro, L. Piñuel, F. Tirado

Citations: 5

Portable SystemC-on-a-chip

International Conference on Hardware/Software Codesign and System Synthesis Pub Date : 2009-10-11 DOI: 10.1145/1629435.1629439

Scott Sirowy, Bailey Miller, F. Vahid

Citations: 9

TotalProf: a fast and accurate retargetable source code profiler

International Conference on Hardware/Software Codesign and System Synthesis Pub Date : 2009-10-11 DOI: 10.1145/1629435.1629477

L. Gao, Jia Huang, J. Ceng, R. Leupers, G. Ascheid, H. Meyr

Abstract: Profilers play an important role in software/hardware design, optimization, and verification. Various approaches have been proposed to implement profilers. The most widespread approach adopted in the embedded domain is Instruction Set Simulation (ISS) based profiling, which provides uncompromised accuracy but limited execution speed. Source code profilers, on the contrary, are fast but less accurate. This paper introduces TotalProf, a fast and accurate source code cross profiler that estimates the performance of an application from three aspects. First, code optimization and a novel virtual compiler backend are employed to resemble the course of target compilation. Second, an optimistic static scheduler is introduced to estimate the behavior of the target processor's datapath. Last but not least, dynamic events, such as cache misses, bus contention and branch prediction failures, are simulated at runtime. With an abstract architecture description, the tool can be easily retargeted in a performance characteristics oriented way to estimate different processor architectures, including DSPs and VLIW machines. Multiple instances of TotalProf can be integrated with SystemC to support heterogeneous Multi-Processor System-on-Chip (MPSoC) profiling. With an error of only about 5 to 15% in the major performance metrics, such as cycle count, memory accesses and cache misses, an execution speed of more than one Giga-Instruction-Per-Second (GIPS) is achieved.

Citations: 22

On-the-fly hardware acceleration for protocol stack processing in next generation mobile devices

International Conference on Hardware/Software Codesign and System Synthesis Pub Date : 2009-10-11 DOI: 10.1145/1629435.1629457

David Szczesny, S. Hessel, Felix Bruns, A. Bilgic

Abstract: In this paper we present a new on-the-fly hardware acceleration approach, based on a smart Direct Memory Access (sDMA) controller, for the layer 2 (L2) downlink protocol stack processing in Long Term Evolution (LTE) and beyond mobile devices. We use virtual prototyping in order to simulate an ARM1176 processor based hardware platform together with the executed software comprising an LTE protocol stack model. The sDMA controller with different hardware accelerator units for the time critical algorithms in the protocol stack is implemented and integrated in the hardware platform. We prove our new hardware/software partitioning concept for the LTE L2 by measuring the average execution time per transport block in the protocol stack at different activated on-the-fly hardware acceleration stages in the sDMA controller. At LTE data rates of 100 Mbit/s, we achieve a speedup of 24% compared to a pure software implementation by enabling the sDMA hardware support for header processing in the protocol stack. Furthermore, an activation of the complete on-the-fly hardware acceleration in the sDMA controller, including on-the-fly deciphering, leads to a speedup of more than 50%. Finally, at transmission conditions with more computational demands and data rates up to 320 Mbit/s, we obtain acceleration ratios of almost 80%. Investigations show that our new sDMA on-the-fly hardware acceleration approach in combination with a single-core processor offers the required computational power for next generation mobile devices.

Citations: 11

Configuration and control of SystemC models using TLM middleware

International Conference on Hardware/Software Codesign and System Synthesis Pub Date : 2009-10-11 DOI: 10.1145/1629435.1629447

C. Schröder, Wolfgang Klingauf, Robert Günzel, M. Burton, Eric Roesler

Citations: 11

Efficient dynamic voltage/frequency scaling through algorithmic loop transformation

International Conference on Hardware/Software Codesign and System Synthesis Pub Date : 2009-10-11 DOI: 10.1145/1629435.1629464

M. Ghodrat, T. Givargis

Citations: 2

Exploring hybrid photonic networks-on-chip for emerging chip multiprocessors

International Conference on Hardware/Software Codesign and System Synthesis Pub Date : 2009-10-11 DOI: 10.1145/1629435.1629453

Shirish Bahirat, S. Pasricha

Abstract: Increasing application complexity and improvements in process technology have today enabled chip multiprocessors (CMPs) with tens to hundreds of cores on a chip. Networks on Chip (NoCs) have emerged as scalable communication fabrics that can support high bandwidths for these massively parallel systems. However, traditional electrical NoC implementations still need to overcome the challenges of high data transfer latencies and large power consumption. On-chip photonic interconnects have recently been proposed as an alternative to address these challenges, with high performance-per-watt characteristics for intra-chip communication. In this paper, we explore using photonic interconnects on a chip to enhance traditional electrical NoCs. Our proposed hybrid photonic NoC utilizes a photonic ring waveguide to enhance a traditional 2D electrical mesh NoC. Experimental results indicate a strong motivation for considering the proposed hybrid photonic NoC for future CMPs: as much as a 13× reduction in power consumption and improved throughput and access latencies, compared to traditional electrical 2D mesh and torus NoC architectures.

Citations: 29

Memory-efficient distribution of regular expressions for fast deep packet inspection

International Conference on Hardware/Software Codesign and System Synthesis Pub Date : 2009-10-11 DOI: 10.1145/1629435.1629456

J. Rohrer, K. Atasu, J. V. Lunteren, C. Hagleitner

Citations: 30