ASAP 2010 - 21st IEEE International Conference on Application-specific Systems, Architectures and Processors: latest papers, page 3

Dependability analysis of a countermeasure against fault attacks by means of laser shots onto a SRAM-based FPGA

ASAP 2010 - 21st IEEE International Conference on Application-specific Systems, Architectures and Processors Pub Date : 2010-07-07 DOI: 10.1109/ASAP.2010.5540765

G. Canivet, P. Maistri, R. Leveugle, F. Valette, J. Clédière, M. Renaudin

Citations: 5

Hardware-assisted middleware: Acceleration of garbage collection operations

ASAP 2010 - 21st IEEE International Conference on Application-specific Systems, Architectures and Processors Pub Date : 2010-07-07 DOI: 10.1109/ASAP.2010.5541011

Jie Tang, Shaoshan Liu, Zhimin Gu, Xiao-Feng Li, J. Gaudiot

Citations: 16

Area optimized H.264 Intra prediction architecture for 1080p HD resolution

ASAP 2010 - 21st IEEE International Conference on Application-specific Systems, Architectures and Processors Pub Date : 2010-07-07 DOI: 10.1109/ASAP.2010.5540989

Jimit Shah, K. S. Raghunandan, Kuruvilla Varghese

Citations: 0

A high efficient memory architecture for H.264/AVC motion compensation

ASAP 2010 - 21st IEEE International Conference on Application-specific Systems, Architectures and Processors Pub Date : 2010-07-07 DOI: 10.1109/ASAP.2010.5540963

Chunshu Li, Kai Huang, Xiaolang Yan, Jiong Feng, De Ma, Haitong Ge

Citations: 4

Combined scheduling and instruction selection for processors with reconfigurable cell fabric

ASAP 2010 - 21st IEEE International Conference on Application-specific Systems, Architectures and Processors Pub Date : 2010-07-07 DOI: 10.1109/ASAP.2010.5540997

Antoine Floch, C. Wolinski, K. Kuchcinski

Citations: 14

Highly efficient mapping of the Smith-Waterman algorithm on CUDA-compatible GPUs

ASAP 2010 - 21st IEEE International Conference on Application-specific Systems, Architectures and Processors Pub Date : 2010-07-07 DOI: 10.1109/ASAP.2010.5540796

Keisuke Dohi, K. Benkrid, Cheng Ling, T. Hamada, Yuichiro Shibata

Abstract: This paper describes a multi-threaded parallel design and implementation of the Smith-Waterman (SW) algorithm on graphics processing units (GPUs) with NVIDIA Corporation's Compute Unified Device Architecture (CUDA). Central to this is a divide-and-conquer approach that divides the computation of a whole pairwise sequence alignment matrix into multiple sub-matrices (or parallelograms), each running efficiently on the available hardware resources of the GPU in hand, with temporary intermediate data stored in global memory. Moreover, we use thread warps and padding techniques to decrease the cost of thread synchronization, as well as loop unrolling to reduce the cost of conditional branches. While intermediate data is stored in global memory for large queries, the innermost loop in our implementation accesses only shared memory and registers. As a result of these optimizations, our implementation of the SW algorithm achieves a throughput between 9.09 GCUPS (giga cell updates per second) and 12.71 GCUPS on a single-GPU version, and between 29.46 GCUPS and 43.05 GCUPS on a quad-GPU platform. Compared with the best GPU implementation of the SW algorithm reported to date, our implementation achieves up to a 46% improvement in speed. The source code of our implementation is available in the public domain for bioinformaticians to benefit from its performance.

Citations: 25
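
For readers unfamiliar with the Smith-Waterman recurrence and the GCUPS metric quoted in the abstract above, the following is a minimal sequential sketch in Python. It is not the authors' CUDA implementation: the scoring values, the linear gap penalty, and the function names `smith_waterman_score` and `gcups` are assumptions chosen for illustration only.

```python
def smith_waterman_score(query, target, match=2, mismatch=-1, gap=-1):
    """Best local alignment score of two strings.

    Assumption: linear gap penalty and illustrative scoring values;
    the paper's actual scoring scheme is not given in the abstract.
    """
    prev = [0] * (len(target) + 1)  # previous row of the alignment matrix
    best = 0
    for i in range(1, len(query) + 1):
        curr = [0] * (len(target) + 1)
        for j in range(1, len(target) + 1):
            sub = match if query[i - 1] == target[j - 1] else mismatch
            # Each cell depends on its upper-left, upper, and left neighbours;
            # the floor at 0 is what makes the alignment local.
            curr[j] = max(0, prev[j - 1] + sub, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def gcups(query_len, target_len, seconds):
    """Giga cell updates per second for one query/target matrix."""
    return query_len * target_len / seconds / 1e9


if __name__ == "__main__":
    print(smith_waterman_score("GTTAC", "GTTGAC", match=3, mismatch=-3, gap=-2))
    print(gcups(1000, 1_000_000, 1.0))
```

Because each cell depends only on its three neighbours, cells on the same anti-diagonal are independent of one another, which is the property that parallel implementations of this kind typically exploit.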

Function flattening for lease-based, information-leak-free systems

ASAP 2010 - 21st IEEE International Conference on Application-specific Systems, Architectures and Processors Pub Date : 2010-07-07 DOI: 10.1109/ASAP.2010.5540946

Xun Li, Mohit Tiwari, T. Sherwood, F. Chong

Citations: 2

Using shared library interposing for transparent application acceleration in systems with heterogeneous hardware accelerators

ASAP 2010 - 21st IEEE International Conference on Application-specific Systems, Architectures and Processors Pub Date : 2010-07-07 DOI: 10.1109/ASAP.2010.5540798

Tobias Beisel, Manuel Niekamp, Christian Plessl

Abstract: Today's computer systems increasingly comprise heterogeneous computing elements such as multi-core processors, graphics processing units, and specialized co-processors, which allow parallel processing. Programming applications to utilize such systems is a complex process and requires good knowledge of the hardware architecture. Automatic and transparent use of these resources is a major concern of domain-specific software developers and users. We present a new approach that uses shared library interposing to replace libraries in binary applications with highly optimized accelerated versions. A plugin-based framework was developed that interposes shared library calls, delegates them to accelerator-specific libraries, and adapts them to the library-specific interface. Accelerator-specific plugins can be added with a high degree of automation. First steps were taken to develop a fast and intelligent selection component that chooses the best possible accelerator for a shared library call. It was shown that such a framework may be used efficiently to apply shared library interposing to transparently speed up existing applications. The BLAS library for linear algebra was used as an example to develop plugins for an acceleratable library. Runtimes of BLAS functions were measured on different architectures and show significant differences depending on the implementation and hardware used, indicating the potentially high speedups of the approach.

Citations: 7
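
The plugin-and-selection idea in the abstract above can be pictured with a small Python analogy. This sketch does not perform real shared library interposing (which happens at the dynamic-linker level); it only illustrates delegating one library call to whichever registered backend has the lowest estimated cost. The class `AcceleratorRegistry`, the backend names, and the cost values are all hypothetical.

```python
class AcceleratorRegistry:
    """Hypothetical registry mapping a library function to backend plugins."""

    def __init__(self):
        self._backends = {}

    def register(self, func_name, backend_name, impl, estimated_cost):
        # estimated_cost is an assumed, user-supplied figure; a real framework
        # might derive it from runtime measurements instead.
        self._backends.setdefault(func_name, []).append(
            (estimated_cost, backend_name, impl)
        )

    def call(self, func_name, *args):
        candidates = self._backends.get(func_name)
        if not candidates:
            raise LookupError(f"no backend registered for {func_name}")
        # Selection step: pick the backend with the lowest estimated cost.
        _, backend_name, impl = min(candidates, key=lambda c: c[0])
        return backend_name, impl(*args)


def dot(x, y):
    """Plain Python dot product standing in for a BLAS-style routine."""
    return sum(a * b for a, b in zip(x, y))


registry = AcceleratorRegistry()
registry.register("ddot", "cpu_reference", dot, estimated_cost=5.0)
registry.register("ddot", "fake_accelerator", dot, estimated_cost=1.0)

if __name__ == "__main__":
    print(registry.call("ddot", [1, 2, 3], [4, 5, 6]))
```

The caller only names the function (`"ddot"`), so backends can be added or re-ranked without changing application code, which mirrors the transparency goal described in the abstract.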

An optimized NoC architecture for accelerating TSP kernels in breakpoint median problem

ASAP 2010 - 21st IEEE International Conference on Application-specific Systems, Architectures and Processors Pub Date : 2010-07-07 DOI: 10.1109/ASAP.2010.5540797

Turbo Majumder, Souradip Sarkar, P. Pande, A. Kalyanaraman

Abstract: The Traveling Salesman Problem (TSP) is a classical NP-complete problem in graph theory. It aims at finding a least-cost Hamiltonian cycle that traverses all vertices of an input edge-weighted graph. One application of TSP is in breakpoint median-based Maximum Parsimony phylogenetic tree reconstruction, wherein a bounded edge-weight model is used. Exponential algorithms that apply efficient heuristics, such as branch-and-bound, to dynamically prune the search space are used. We adopted this approach in an NoC-based implementation for solving TSP targeted towards phylogenetics, taking advantage of the fine-grained parallelism and efficient communication network. The largest fraction of the solution time for TSP is accounted for by a particular lower bound calculation operation that uses the graph's adjacency matrix. In this paper, we present the design and implementation of the processing elements with a highly optimized lower bound computation kernel and evaluate its performance. Additionally, we explore two major NoC architectures, mesh and quad-tree, and show that the latter is more suitable for this application domain.

Citations: 5
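
The abstract above does not say which lower bound the kernel computes. As one illustration of a branch-and-bound lower bound derived from an adjacency (cost) matrix, the sketch below uses the classic row-and-column reduction bound; this choice, and the function name `reduction_lower_bound`, are assumptions and not necessarily the paper's method.

```python
INF = float("inf")


def reduction_lower_bound(cost):
    """Row-and-column reduction lower bound on the cost of any TSP tour.

    cost[i][j] is the edge weight from vertex i to vertex j, with INF on
    the diagonal. Assumption: this is a textbook bound used for illustration.
    """
    m = [row[:] for row in cost]  # work on a copy
    n = len(m)
    bound = 0
    # Every tour leaves each vertex once, so each row minimum is unavoidable.
    for i in range(n):
        row_min = min(m[i])
        if row_min != INF and row_min > 0:
            bound += row_min
            m[i] = [c - row_min for c in m[i]]
    # Every tour enters each vertex once, so each remaining column minimum
    # is unavoidable as well.
    for j in range(n):
        col_min = min(m[i][j] for i in range(n))
        if col_min != INF and col_min > 0:
            bound += col_min
            for i in range(n):
                m[i][j] -= col_min
    return bound


if __name__ == "__main__":
    example = [
        [INF, 10, 15, 20],
        [10, INF, 35, 25],
        [15, 35, INF, 30],
        [20, 25, 30, INF],
    ]
    print(reduction_lower_bound(example))
```

In a branch-and-bound search, a partial tour whose bound already exceeds the best complete tour found so far can be pruned, so a bound of this kind is evaluated at every node of the search tree.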

A Bayesian network-based framework with Constraint Satisfaction Problem (CSP) formulations for FPGA system design

ASAP 2010 - 21st IEEE International Conference on Application-specific Systems, Architectures and Processors Pub Date : 2010-07-07 DOI: 10.1109/ASAP.2010.5540784

A. Azman, A. Bigdeli, Yasir Mohd-Mustafah, M. Biglari-Abhari, B. Lovell

Citations: 0