ASAP 2010 - 21st IEEE International Conference on Application-specific Systems, Architectures and Processors: Latest Papers, Page 2

A formal specification of fault-tolerance in prospecting asteroid mission with Reactive Autonomic Systems Framework

ASAP 2010 - 21st IEEE International Conference on Application-specific Systems, Architectures and Processors Pub Date : 2010-07-07 DOI: 10.1109/ASAP.2010.5540769

Heng Kuang, O. Ormandjieva, S. Klasa, J. Bentahar

Citations: 4

High parallel variation Banyan network based permutation network for reconfigurable LDPC decoder

ASAP 2010 - 21st IEEE International Conference on Application-specific Systems, Architectures and Processors Pub Date : 2010-07-07 DOI: 10.1109/ASAP.2010.5540964

Xiao Peng, Zhixiang Chen, Xiongxin Zhao, F. Maehara, S. Goto

Citations: 14

Elliptic Curve point multiplication on GPUs

ASAP 2010 - 21st IEEE International Conference on Application-specific Systems, Architectures and Processors Pub Date : 2010-07-07 DOI: 10.1109/ASAP.2010.5541000

S. Antão, J. Bajard, L. Sousa

Citations: 42

Newton-Raphson algorithms for floating-point division using an FMA

ASAP 2010 - 21st IEEE International Conference on Application-specific Systems, Architectures and Processors Pub Date : 2010-07-07 DOI: 10.1109/ASAP.2010.5540948

N. Louvet, J. Muller, A. Panhaleux

Citations: 20

Implementation of binary Edwards curves for very-constrained devices

ASAP 2010 - 21st IEEE International Conference on Application-specific Systems, Architectures and Processors Pub Date : 2010-07-07 DOI: 10.1109/ASAP.2010.5541003

Ünal Koçabas, Junfeng Fan, I. Verbauwhede

Citations: 39

FPGA-based lossless compressors of floating-point data streams to enhance memory bandwidth

ASAP 2010 - 21st IEEE International Conference on Application-specific Systems, Architectures and Processors Pub Date : 2010-07-07 DOI: 10.1109/ASAP.2010.5540973

Kazuya Katahira, K. Sano, S. Yamamoto

Citations: 15

On energy efficiency of reconfigurable systems with run-time partial reconfiguration

ASAP 2010 - 21st IEEE International Conference on Application-specific Systems, Architectures and Processors Pub Date : 2010-07-07 DOI: 10.1109/ASAP.2010.5540985

Shaoshan Liu, Richard Neil Pittman, Alessandro Forin, J. Gaudiot

Citations: 26

Customizing controller instruction sets for application-specific architectures

ASAP 2010 - 21st IEEE International Conference on Application-specific Systems, Architectures and Processors Pub Date : 2010-07-07 DOI: 10.1109/ASAP.2010.5540965

Jian Li, David Dickin, Lesley Shannon

Citations: 0

An FPGA-specific algorithm for direct generation of multi-variate Gaussian random numbers

ASAP 2010 - 21st IEEE International Conference on Application-specific Systems, Architectures and Processors Pub Date : 2010-07-07 DOI: 10.1109/ASAP.2010.5541005

David B. Thomas, W. Luk

Abstract: The multi-variate Gaussian distribution is used to model random processes with distinct pair-wise correlations, such as stock prices that tend to rise and fall together. Multi-variate Gaussian vectors of length n are usually produced by first generating a vector of n independent Gaussian samples, then multiplying it by a correlation-inducing matrix, which requires O(n^2) multiplications. This paper presents a method of generating vectors directly from the uniform distribution, removing the need for an expensive scalar Gaussian generator and eliminating the need for any multipliers. The method relies only on small ROMs and adders, and so can be implemented using just logic resources (LUTs and FFs), saving DSP and block-RAM resources for the numerical simulation that the multi-variate generator is driving. The new method provides a ten-fold increase in raw performance over the fastest existing FPGA generation method, and a five-fold improvement in performance per resource over the most efficient existing method. Using this method, a single 400 MHz Virtex-5 FPGA can generate vectors ten times faster than an optimised CUDA implementation on a 1.2 GHz GPU, and a hundred times faster than SIMD-optimised software on a quad-core 2.2 GHz CPU.

Citations: 7
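
For reference, the conventional approach mentioned in the abstract of the Thomas and Luk entry above (n independent Gaussian samples multiplied by a correlation-inducing matrix) can be sketched in plain Python. This is a minimal illustration of that baseline only, not the paper's multiplier-free FPGA method. The function names `cholesky` and `correlated_gaussian`, and the choice of a Cholesky factor as the correlation-inducing matrix, are assumptions made for this sketch.

```python
import math
import random


def cholesky(sigma):
    """Return lower-triangular A with A * A^T == sigma.

    sigma must be a symmetric positive-definite matrix (list of lists).
    """
    n = len(sigma)
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(a[i][k] * a[j][k] for k in range(j))
            if i == j:
                a[i][j] = math.sqrt(sigma[i][i] - s)
            else:
                a[i][j] = (sigma[i][j] - s) / a[j][j]
    return a


def correlated_gaussian(a, rng):
    """Draw one zero-mean vector with covariance A * A^T."""
    n = len(a)
    # Step 1: n independent standard Gaussian samples.
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Step 2: matrix-vector product. With a triangular A this costs
    # n(n+1)/2 multiplications, which is the O(n^2) term in the abstract.
    return [sum(a[i][k] * z[k] for k in range(i + 1)) for i in range(n)]


if __name__ == "__main__":
    target = [[1.0, 0.8], [0.8, 1.0]]  # example covariance, chosen arbitrarily
    factor = cholesky(target)
    print(correlated_gaussian(factor, random.Random(0)))
```

The two steps in `correlated_gaussian` correspond to the scalar Gaussian generator and the multipliers that, according to the abstract, the paper's method avoids.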

Completeness of automatically generated instruction selectors

ASAP 2010 - 21st IEEE International Conference on Application-specific Systems, Architectures and Processors Pub Date : 2010-07-07 DOI: 10.1109/ASAP.2010.5540994

F. Brandner

Abstract: The use of tree pattern matching for instruction selection has proven very successful in modern compilers. This can be attributed to the declarative nature of tree grammar specifications, which greatly simplifies the development of fast, high-quality code generators. The approach has also been adopted widely by generator tools that aim to automatically extract the instruction selector, as well as other compiler components, for application-specific instruction processors from generic processor models. A major advantage of tree pattern matching is that it is suitable for static analysis and allows properties of a given specification to be verified. Completeness is an important example of such a property, in particular for automatically generated compilers. Tree automata can be used to prove that a given instruction selector specification is complete, i.e., that it can actually generate machine code for all possible input programs. Traditional approaches to completeness tests cannot represent dynamic checks that may disable certain matching rules during code generation. However, these dynamic checks occur very frequently in compilers targeting application-specific processors. The dynamic checks arise from hidden properties that are not captured by the terminal symbols of the tree grammar notation. We apply terminal splitting to the instruction selector specifications that are automatically derived from structural processor models to make these properties explicit. The transformed specification is then verified using a traditional completeness test. If the test fails, counterexamples are presented that allow the compiler to be adapted or the processor model to be extended accordingly.

Citations: 5
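
To make the idea of a completeness test in the Brandner entry above more concrete, here is a minimal sketch of a bottom-up tree-automaton check over a toy tree grammar. It is an illustration under simplifying assumptions, not the paper's algorithm: it assumes every tree over the operator alphabet is a valid input, and it does not model dynamic checks or terminal splitting. The grammar, the operator names (`REG`, `CONST`, `ADD`), and the function names are all hypothetical.

```python
from itertools import product


def closure(nts, chain_rules):
    """Add every nonterminal derivable through chain rules (lhs <- rhs)."""
    result = set(nts)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in chain_rules:
            if rhs in result and lhs not in result:
                result.add(lhs)
                changed = True
    return frozenset(result)


def find_counterexample(operators, base_rules, chain_rules, goal):
    """Return a tree (as a string) that cannot be reduced to goal, or None.

    operators:   dict mapping operator name -> arity
    base_rules:  list of (lhs, operator, tuple of child nonterminals)
    chain_rules: list of (lhs, rhs) nonterminal pairs
    """
    # Each state is the set of nonterminals that some subtree can match;
    # one witness tree is remembered per state for reporting.
    states = {}
    changed = True
    while changed:
        changed = False
        for op, arity in operators.items():
            for kids in product(list(states), repeat=arity):
                matched = {
                    lhs
                    for lhs, rule_op, rule_kids in base_rules
                    if rule_op == op
                    and len(rule_kids) == arity
                    and all(nt in st for nt, st in zip(rule_kids, kids))
                }
                state = closure(matched, chain_rules)
                if state not in states:
                    if arity:
                        args = ", ".join(states[k] for k in kids)
                        states[state] = f"{op}({args})"
                    else:
                        states[state] = op
                    changed = True
    # The grammar is complete only if every reachable state contains goal.
    for state, tree in states.items():
        if goal not in state:
            return tree
    return None


OPERATORS = {"REG": 0, "CONST": 0, "ADD": 2}
BASE_RULES = [
    ("reg", "REG", ()),
    ("imm", "CONST", ()),
    ("reg", "ADD", ("reg", "reg")),
    ("reg", "ADD", ("reg", "imm")),
]
LOAD_IMMEDIATE = [("reg", "imm")]  # chain rule: materialise a constant

if __name__ == "__main__":
    print(find_counterexample(OPERATORS, BASE_RULES, LOAD_IMMEDIATE, "reg"))
    print(find_counterexample(OPERATORS, BASE_RULES, [], "reg"))
```

With the load-immediate chain rule the toy grammar covers every tree and no counterexample is found; without it, a bare `CONST` can no longer be reduced to `reg` and is reported as the failing input.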