2010 IEEE 8th Symposium on Application Specific Processors (SASP): latest papers, page 2

A Coarse Grain Reconfigurable Architecture for sequence alignment problems in bio-informatics

2010 IEEE 8th Symposium on Application Specific Processors (SASP) Pub Date : 2010-06-13 DOI: 10.1109/SASP.2010.5521146

Pei Liu, A. Hemani

Abstract: A Coarse Grain Reconfigurable Architecture (CGRA) tailored for accelerating bio-informatics algorithms is proposed. The key innovation is a lightweight bio-informatics processor that can be reconfigured to perform the different Add Compare and Select operations of the popular sequencing algorithms. A programmable and scalable architectural platform instantiates an array of such processing elements, allows arbitrary partitioning and scheduling schemes, and is capable of solving complete sequencing algorithms, including the sequential phases, and of dealing with arbitrarily large sequences. The key difference between the proposed CGRA-based solution and FPGA- and GPU-based solutions is a much better match of the architecture and algorithm for the core computational need as well as the system-level architectural need. This claim is quantified for three popular sequencing algorithms: Needleman-Wunsch, Smith-Waterman and HMMER. For the same degree of parallelism, we provide 5X and 15X speed-ups compared to FPGA and GPU respectively. For the same size of silicon, the advantage grows by a further factor of 10X.
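For readers unfamiliar with the Add Compare and Select step mentioned in this abstract, the following is a minimal software sketch of the Smith-Waterman cell update with a linear gap penalty. It is not the authors' hardware design; the function name `smith_waterman_score` and the scoring parameters (`match`, `mismatch`, `gap`) are illustrative assumptions.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score of strings a and b.

    Assumed scoring scheme (not taken from the paper): linear gap penalty.
    Only two rows of the dynamic-programming matrix are kept in memory.
    """
    prev = [0] * (len(b) + 1)  # row i-1 of the score matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)  # row i of the score matrix
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # Add: extend each of the three neighbouring cells.
            diag = prev[j - 1] + sub
            up = prev[j] + gap
            left = curr[j - 1] + gap
            # Compare and select: keep the maximum, floored at 0 for local alignment.
            curr[j] = max(0, diag, up, left)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("ACGT", "ACGT"))
```

Each cell depends only on its left, upper, and upper-left neighbours, so cells on the same anti-diagonal are independent of one another; this is the property that arrays of processing elements typically exploit for this class of algorithm.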

Citations: 4

Customized architectures for faster route finding in GPS-based navigation systems

2010 IEEE 8th Symposium on Application Specific Processors (SASP) Pub Date : 2010-06-13 DOI: 10.1109/SASP.2010.5521148

Jason Loew, D. Ponomarev, P. Madden

Citations: 2

A hardware pipeline for accelerating ray traversal algorithms on streaming processors

2010 IEEE 8th Symposium on Application Specific Processors (SASP) Pub Date : 2010-06-13 DOI: 10.1109/SASP.2010.5521150

Michael Steffen, Joseph Zambreno

Abstract: Ray Tracing is a graphics rendering method that uses rays to trace the path of light in a computer model. To accelerate the processing of rays, scenes are typically compiled into smaller spatial boxes using a tree structure and rays then traverse the tree structure to determine relevant spatial boxes. This allows computations involving rays and scene objects to be limited to only objects close to the ray and does not require processing all elements in the computer model. We present a ray traversal pipeline designed to accelerate ray tracing traversal algorithms using a combination of currently used programmable graphics processors and a new fixed hardware pipeline. Our fixed hardware pipeline performs an initial traversal operation that quickly identifies a smaller sized, fixed granularity spatial bounding box from the original scene. This spatial box can then be traversed further to identify subsequently smaller spatial bounding boxes using any user-defined acceleration algorithm. We show that our pipeline allows for an expected level of user programmability, including development of custom data structures, and can support a wide range of processor architectures. The performance of our pipeline is evaluated for ray traversal and intersection stages using a kd-tree ray tracing algorithm and a custom simulator modeling a generic streaming processor architecture. Experimental results show that our pipeline reduces the number of executed instructions on a graphics processor for the traversal operation by 2.15X for visible rays. The memory bandwidth required for traversal is also reduced by a factor of 1.3X for visible rays.
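The abstract describes rays being tested against spatial bounding boxes so that only nearby objects need processing. As a rough illustration of that basic test, here is a sketch of the standard "slab" ray versus axis-aligned box check. The abstract does not say which intersection test the pipeline uses, so the slab method and the name `ray_hits_box` are assumptions made for illustration only.

```python
import math


def ray_hits_box(origin, direction, box_min, box_max):
    """Return True if the ray origin + t*direction (t >= 0) touches the box.

    Standard slab method: intersect the ray's parameter interval with the
    interval spanned by each pair of axis-aligned planes.
    """
    t_near, t_far = 0.0, math.inf
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        if d == 0.0:
            # Ray is parallel to this slab: it must start between the planes.
            if o < lo or o > hi:
                return False
            continue
        t0 = (lo - o) / d
        t1 = (hi - o) / d
        if t0 > t1:
            t0, t1 = t1, t0  # order the entry and exit distances
        t_near = max(t_near, t0)
        t_far = min(t_far, t1)
        if t_near > t_far:
            return False  # the per-axis intervals no longer overlap
    return True


if __name__ == "__main__":
    print(ray_hits_box((0, 0, 0), (1, 1, 1), (1, 1, 1), (2, 2, 2)))
```

A tree traversal such as the kd-tree scheme mentioned above would apply a test of this kind (or a cheaper split-plane comparison) at each node to decide which children to visit.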

Citations: 4

CMA: Chip multi-accelerator

2010 IEEE 8th Symposium on Application Specific Processors (SASP) Pub Date : 2010-06-13 DOI: 10.1109/SASP.2010.5521152

Dominik Auras, Sylvain Girbal, H. Berry, O. Temam, S. Yehia

Abstract: Custom acceleration has been a standard choice in embedded systems thanks to the power density and performance efficiency it provides. Parallelism is another orthogonal scalability path that efficiently overcomes the increasing limitation of frequency scaling in current general-purpose architectures. In this paper we propose a multi-accelerator architecture that combines the best of both worlds, parallelism and custom acceleration, while addressing the programmability inconvenience of heterogeneous multiprocessing systems. A Chip Multi-Accelerator (CMA) is a regular parallel architecture where each core is complemented with a custom accelerator to speed up specific functions. Furthermore, by using techniques to efficiently merge more than one custom accelerator together, we are able to cram as many accelerators as needed by the application or a domain of applications. We demonstrate our approach on a Software Defined Radio (SDR) case study. We show that starting from a baseline description of several SDR waveforms and candidate tasks for acceleration, we are able to map the different waveforms on the heterogeneous multi-accelerator architecture while keeping a logical view of a regular multi-core architecture, thus simplifying the mapping of the waveforms onto the multi-accelerator.

Citations: 10

Accelerating DNA analysis applications on GPU clusters

2010 IEEE 8th Symposium on Application Specific Processors (SASP) Pub Date : 2010-06-13 DOI: 10.1109/SASP.2010.5521145

Antonino Tumeo, Oreste Villa

Citations: 41

Design of a custom VEE core in a chip multiprocessor

2010 IEEE 8th Symposium on Application Specific Processors (SASP) Pub Date : 2010-06-13 DOI: 10.1109/SASP.2010.5521138

Dan Upton, K. Hazelwood

Citations: 0

I-cache configurability for temperature reduction through replicated cache partitioning

2010 IEEE 8th Symposium on Application Specific Processors (SASP) Pub Date : 2010-06-13 DOI: 10.1109/SASP.2010.5521143

M. Paul, Peter Petrov

Abstract: On-chip caches have been known to be a major contributor to leakage power as they occupy a sizable fraction of the chip's real estate and as such have been the target of power optimization techniques. However, many of these techniques do not consider the effects of temperature on leakage power and can hence be suboptimal since leakage power rises rapidly with temperature. When large fractions of the cache are disabled and only a small partition is used, the power density increases significantly, which leads to increased temperature and leakage. We propose a temperature reduction methodology that leverages recently introduced configurable caches, in order to not only assign to the task a cache partition commensurate to its current demand but also to minimize the associated power density and temperature. In order to counteract the effect of elevated power density and achieve temperature reductions, in the proposed technique each such cache partition is replicated and only one of the replicas is active at any time. The inactive partition replicas are placed into a low-power drowsy mode while the primary partition services the task's instruction requests. By periodically switching the task's association between replica cache partitions, the power density and hence the temperature are reduced.

Citations: 1

Efficient design and generation of a multi-facet arbiter

2010 IEEE 8th Symposium on Application Specific Processors (SASP) Pub Date : 2010-06-13 DOI: 10.1109/SASP.2010.5521137

J. Jou, Yun-Lung Lee, Sih-Sian Wu

Citations: 3

A dynamically reconfigurable asynchronous processor

2010 IEEE 8th Symposium on Application Specific Processors (SASP) Pub Date : 2010-06-13 DOI: 10.1109/SASP.2010.5521141

Khodor Ahmad Fawaz, T. Arslan, S. Khawam, M. Muir, I. Nousias, Iain A. B. Lindsay, A. Erdogan

Citations: 3

FPGA and GPU implementation of large scale SpMV

2010 IEEE 8th Symposium on Application Specific Processors (SASP) Pub Date : 2010-06-01 DOI: 10.1109/SASP.2010.5521144

Yi Shan, Tianji Wu, Yu Wang, Bo Wang, Zilong Wang, Ningyi Xu, Huazhong Yang

Citations: 31