International Conference on Hardware/Software Codesign and System Synthesis: Latest Papers, Page 7

Traversal caches: a first step towards FPGA acceleration of pointer-based data structures

International Conference on Hardware/Software Codesign and System Synthesis Pub Date : 2008-10-19 DOI: 10.1145/1450135.1450150

G. Stitt, Gaurav Chaudhari, J. Coole

Abstract: Field-programmable gate arrays (FPGAs) often achieve order of magnitude speedups compared to microprocessors, but typically have been unable to improve the performance of applications with irregular memory access patterns, such as traversals of pointer-based data structures. Due to the common use of these data structures, the applicability and widespread success of FPGAs has been limited. In this paper, we introduce the traversal cache framework - a first step towards improving the performance of FPGA applications that utilize pointer-based data structures. The traversal cache is a local FPGA memory that stores repeated traversals of pointer-based data structures, allowing for these traversals to be efficiently streamed into the FPGA. Although the cache is generally limited to improving applications that exhibit repeated traversals, we show that many applications in fact have this characteristic. Furthermore, we show that few repetitions are needed to achieve performance improvements. We present experimental results showing that FPGA implementations using the traversal cache framework achieve speedups ranging from 7x to 29x compared to pointer-based software on a 3.2 GHz Xeon.
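The core idea in the abstract above (store the result of a pointer-chasing traversal once, then serve repeats from contiguous memory) can be illustrated with a small software analogue. This is only a sketch under stated assumptions, not the authors' FPGA framework: the names `Node`, `TraversalCache`, `build_list`, and the `pointer_chases` counter are hypothetical, and the sketch assumes the list is not modified between traversals.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Node:
    """A singly linked list node (hypothetical example structure)."""
    value: int
    next: Optional["Node"] = None


class TraversalCache:
    """Hypothetical software analogue of a traversal cache.

    The first traversal from a given head follows pointers and records the
    visited values in a contiguous list. Repeated traversals from the same
    head return that list without following any pointers.
    """

    def __init__(self) -> None:
        self._cache: Dict[int, List[int]] = {}
        self.pointer_chases = 0  # counts node visits made by pointer chasing

    def traverse(self, head: Optional[Node]) -> List[int]:
        key = id(head)  # assumption: head stays alive and the list is unmodified
        if key in self._cache:
            return self._cache[key]  # repeated traversal: stream from the cache
        values: List[int] = []
        node = head
        while node is not None:
            values.append(node.value)
            self.pointer_chases += 1
            node = node.next
        self._cache[key] = values
        return values

    def invalidate(self) -> None:
        """Drop all cached traversals, e.g. after the structure changes."""
        self._cache.clear()


def build_list(values: List[int]) -> Optional[Node]:
    """Build a linked list holding the values in order."""
    head: Optional[Node] = None
    for v in reversed(values):
        head = Node(v, head)
    return head


if __name__ == "__main__":
    head = build_list([3, 1, 4])
    cache = TraversalCache()
    for _ in range(5):
        cache.traverse(head)
    print(cache.pointer_chases)
```

In this sketch only the first of the five traversals walks the pointers; the rest read the cached contiguous copy, which loosely mirrors why the abstract says the benefit is limited to applications with repeated traversals.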

Citations: 15

SPaC: a symbolic pareto calculator

International Conference on Hardware/Software Codesign and System Synthesis Pub Date : 2008-10-19 DOI: 10.1145/1450135.1450176

H. Shojaei, T. Basten, M. Geilen, Phillip Stanley-Marbell

Citations: 6

Software optimization for MPSoC: a mpeg-2 decoder case study

International Conference on Hardware/Software Codesign and System Synthesis Pub Date : 2008-10-19 DOI: 10.1145/1450135.1450146

Eric Cheung, H. Hsieh, F. Balarin

Citations: 0

Dynamic tuning of configurable architectures: the AWW online algorithm

International Conference on Hardware/Software Codesign and System Synthesis Pub Date : 2008-10-19 DOI: 10.1145/1450135.1450158

Chen-Chun Huang, David Sheldon, F. Vahid

Citations: 10

Specification-based compaction of directed tests for functional validation of pipelined processors

International Conference on Hardware/Software Codesign and System Synthesis Pub Date : 2008-10-19 DOI: 10.1145/1450135.1450167

Heon-Mo Koo, P. Mishra

Citations: 8

You can catch more bugs with transaction level honey

International Conference on Hardware/Software Codesign and System Synthesis Pub Date : 2008-10-19 DOI: 10.1145/1450135.1450163

M. Abramovici, K. Goossens, B. Vermeulen, J. Greenbaum, N. Stollon, A. Donlin

Citations: 7

Guaranteed scheduling for repetitive hard real-time tasks under the maximal temperature constraint

International Conference on Hardware/Software Codesign and System Synthesis Pub Date : 2008-10-19 DOI: 10.1145/1450135.1450196

Gang Quan, Yan Zhang, William Wiles, Pei Pei

Citations: 39

Asynchronous transient resilient links for NoC

International Conference on Hardware/Software Codesign and System Synthesis Pub Date : 2008-10-19 DOI: 10.1145/1450135.1450182

S. Ogg, B. Al-Hashimi, A. Yakovlev

Citations: 22

Extending open core protocol to support system-level cache coherence

International Conference on Hardware/Software Codesign and System Synthesis Pub Date : 2008-10-19 DOI: 10.1145/1450135.1450173

K. Aisopos, Chien-Chun Chou, L. Peh

Citations: 8

Scratchpad allocation for concurrent embedded software

International Conference on Hardware/Software Codesign and System Synthesis Pub Date : 2008-10-19 DOI: 10.1145/1450135.1450145

Vivy Suhendra, Abhik Roychoudhury, T. Mitra

Abstract: Software-controlled scratchpad memory is increasingly employed in embedded systems as it offers better timing predictability compared to caches. Previous scratchpad allocation algorithms typically consider single process applications. But embedded applications are mostly multi-tasking with real-time constraints, where the scratchpad memory space has to be shared among interacting processes that may preempt each other. In this paper, we develop a novel dynamic scratchpad allocation technique that takes these process interferences into account to improve the performance and predictability of the memory system. We model the application as a Message Sequence Chart (MSC) to best capture the interprocess interactions. Our goal is to optimize the worst-case response time (WCRT) of the application through runtime reloading of the scratchpad memory content at appropriate execution points. We propose an iterative allocation algorithm that consists of two critical steps: (1) analyze the MSC along with the existing allocation to determine potential interference patterns, and (2) exploit this interference information to tune the scratchpad reloading points and content so as to best improve the WCRT. We evaluate our memory allocation scheme on a real-world embedded application controlling an Unmanned Aerial Vehicle (UAV).

Citations: 8