Proceedings The International Conference on Application Specific Array Processors: Latest Papers

Design of a systolic coprocessor for rational addition
Proceedings The International Conference on Application Specific Array Processors Pub Date : 1995-07-24 DOI: 10.1109/ASAP.1995.522932
T. Jebelean
Abstract: We design a systolic coprocessor for the addition of signed normalized rational numbers. This is the most complicated rational operation: it involves GCD, exact division, multiplication and addition/subtraction. In particular, the implementations of GCD and exact division improve significantly (2 to 4 times) on previously known solutions. In contrast to the traditional approach, all operations are performed least-significant digits first. This allows bit-pipelining between partial operations at reduced area cost. An Atmel FPGA design for 8-bit operands consumes 730 cells (3,500 equivalent gates) and runs at 25 MHz (5 MHz after layout). For 32-bit operands this would be in the same timing range as the software solutions; however, a significant speed-up can be expected for longer operands because of the linear time complexity of the hardware algorithms.
Citations: 2
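The entry above names the four partial operations of rational addition: GCD, exact division, multiplication and addition/subtraction. The following is a minimal software sketch of how those steps can combine, using the classical two-GCD formulation of fraction addition. It is an illustration only, not the paper's systolic, least-significant-digit-first design; the function name `add_rationals` and its argument order are assumptions made for this example.

```python
from math import gcd


def add_rationals(a, b, c, d):
    """Return a/b + c/d in lowest terms as (numerator, denominator).

    Assumes both inputs are normalized: gcd(a, b) == gcd(c, d) == 1
    and b, d > 0. (Hypothetical helper, not taken from the paper.)
    """
    g = gcd(b, d)                 # first GCD: common part of the denominators
    b1, d1 = b // g, d // g       # exact divisions (no remainder by construction)
    t = a * d1 + c * b1           # two multiplications and one signed addition
    g2 = gcd(t, g)                # second GCD: the only factor left to cancel
    return t // g2, b1 * (d // g2)  # exact divisions and a final multiplication


print(add_rationals(1, 6, 1, 3))
```

Both GCDs operate on numbers no larger than the input denominators or the intermediate numerator, which is why this formulation is usually preferred over reducing the full cross-multiplied fraction.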
A parallelizing compilation method for the map-oriented machine
Proceedings The International Conference on Application Specific Array Processors Pub Date : 1995-07-24 DOI: 10.1109/ASAP.1995.522914
R. Hartenstein, J. Becker, R. Kress, H. Reinig, K. Schmidt
Abstract: The paper introduces a novel parallelizing compilation method for the MoM. The MoM (Map-oriented Machine) is an Xputer architecture featuring multiple data sequencers and "soft ALUs". The compiler accepts C source code, which is restructured and partitioned into structural and sequential code, providing parallelism at expression and statement level.
Citations: 1
A simple array processor for binary prefix sums
Proceedings The International Conference on Application Specific Array Processors Pub Date : 1995-07-24 DOI: 10.1109/ASAP.1995.522911
R. Lin, S. Olariu
Abstract: The task of computing the prefix sums of a binary sequence (BPS, for short) arises frequently in expression evaluation, data and storage compaction, routing, processor assignment, and operating system design. The main goal of this work is to propose an efficient special-purpose architecture for the BPS problem. Our design exploits a novel and elegant idea that allows us to considerably reduce the number of processors of the best-known design. The resulting design is simple and intuitive and scales easily to handle input sequences of various sizes.
Citations: 0
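For readers unfamiliar with the BPS problem named in the entry above, here is a short sequential reference implementation: output position k holds the number of 1-bits among the first k+1 inputs. This only pins down the input/output behaviour; it says nothing about the array-processor design itself, and the name `binary_prefix_sums` is an assumption for this sketch.

```python
from itertools import accumulate


def binary_prefix_sums(bits):
    """Return the running count of 1s in a sequence of 0/1 values.

    Sequential reference only; a hardware design would compute the
    same outputs in parallel. (Hypothetical helper name.)
    """
    bits = list(bits)
    if any(b not in (0, 1) for b in bits):
        raise ValueError("input must contain only 0 and 1")
    return list(accumulate(bits))


print(binary_prefix_sums([1, 0, 1, 1, 0, 1]))
```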
The naive execution of affine recurrence equations
Proceedings The International Conference on Application Specific Array Processors Pub Date : 1995-07-24 DOI: 10.1109/ASAP.1995.522900
D. Wilde, S. Rajopadhye
Abstract: In recognition of the fundamental relation between regular arrays and systems of affine recurrence equations, the ALPHA language was developed as the basis of a computer aided design methodology for regular array architectures. ALPHA is used to initially specify algorithms at a very high algorithmic level. Regular array architectures can then be derived from the algorithmic specification using a transformational approach supported by the ALPHA environment. This design methodology guarantees the final design to be correct by construction, assuming the initial algorithm was correct. In this paper, we address the problem of validating an initial specification. We demonstrate a translation methodology which compiles ALPHA into the imperative sequential language C. The C-code may then be compiled and executed to test the specification. We show how an ALPHA program can be naively implemented by viewing it as a set of monolithic arrays and their filling functions, implemented using applicative caching. This is the approach which is used by the translator. We discuss two problems that had to be solved before implementing the translator. The first is how to allocate 1-dimensional storage for a polyhedron, and the second is how to scan a polyhedron with nested loops.
Citations: 9
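The three ideas mentioned in the entry above (a filling function evaluated with applicative caching, 1-dimensional storage for a polyhedral domain, and scanning that domain with nested loops) can be sketched on a toy recurrence. The example below uses Pascal's triangle over the triangular domain {(i, j) | 0 <= j <= i < n}. It is written in Python rather than the C that the translator emits, and every name in it (`pascal_triangle`, `index`, `value`) is an assumption chosen for illustration, not output of the ALPHA tools.

```python
def pascal_triangle(n):
    """Evaluate C[i][j] = C[i-1][j-1] + C[i-1][j] over a triangular domain."""
    # 1-D storage for the triangle: row i starts at offset i*(i+1)/2.
    store = [None] * (n * (n + 1) // 2)   # None marks "not yet computed"

    def index(i, j):
        return i * (i + 1) // 2 + j

    def value(i, j):
        # Filling function with caching: compute on first demand, then reuse.
        k = index(i, j)
        if store[k] is None:
            if j == 0 or j == i:
                store[k] = 1              # boundary of the domain
            else:
                store[k] = value(i - 1, j - 1) + value(i - 1, j)
        return store[k]

    # Scan the domain with nested loops; the inner bound depends on i.
    return [[value(i, j) for j in range(i + 1)] for i in range(n)]


print(pascal_triangle(5)[4])
```

Because values are computed on demand and cached, the scan order does not need to respect the dependences of the recurrence, which is what makes this style of execution "naive" yet safe for testing a specification.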
Minimizing synchronization overhead in statically scheduled multiprocessor systems
Proceedings The International Conference on Application Specific Array Processors Pub Date : 1995-07-24 DOI: 10.1109/ASAP.1995.522934
S. Bhattacharyya, S. Sriram, Edward A. Lee
Abstract: Synchronization overhead can significantly degrade performance in embedded multiprocessor systems. This paper develops techniques to determine a minimal set of processor synchronizations that are essential for correct execution in an embedded multiprocessor implementation. Our study is set in the context of self-timed execution of iterative dataflow programs; dataflow programming in this form has been applied extensively, particularly in the context of signal processing software. Self-timed execution refers to a combined compile-time/run-time scheduling strategy in which processors synchronize with one another only based on inter-processor communication requirements, and thus, synchronization of processors at the end of each loop iteration does not generally occur. We introduce a new graph-theoretic framework, based on a data structure called the synchronization graph, for analyzing and optimizing synchronization overhead in self-timed, iterative dataflow programs. We also present an optimization that involves converting a synchronization graph that is not strongly connected into a strongly connected graph.
Citations: 22
Column compression pipelined multipliers
Proceedings The International Conference on Application Specific Array Processors Pub Date : 1995-07-24 DOI: 10.1109/ASAP.1995.522909
L. Breveglieri, L. Dadda, V. Piuri
Abstract: The paper presents a study on the introduction of pipelining in parallel VLSI multipliers, built according to the column compression (CC) design techniques. A number of CC multiplier schemes have been proposed in the literature, aimed at reducing the number of stages of adders necessary to compute a multiplication. More recently, CC multiplier schemes aimed at optimising the required silicon area and the regularity and locality of the interconnections among the adders have been proposed. The paper addresses the introduction of pipelining in these latter structures and compares the obtained results with existing structures, in terms of required number of components and operation frequency.
Citations: 3
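As background for the column compression idea named in the entry above, the sketch below simulates a generic scheme in software: partial-product bits are grouped into columns by weight, and each stage applies full adders (3:2 counters) until no column holds more than two bits, after which a single addition finishes the product. This is a plain Wallace-style reduction chosen for brevity, not any of the specific CC schemes or the pipelining studied in the paper; `cc_multiply` and its parameter `n` (operand width in bits) are assumptions for this example.

```python
def cc_multiply(a, b, n):
    """Multiply two n-bit unsigned integers by column compression."""
    if not (0 <= a < 1 << n and 0 <= b < 1 << n):
        raise ValueError("operands must fit in n bits")
    # Column k collects every partial-product bit of weight 2**k.
    cols = [[] for _ in range(2 * n)]
    for i in range(n):
        for j in range(n):
            cols[i + j].append((a >> i) & (b >> j) & 1)
    # One loop iteration models one stage of adders.
    while any(len(c) > 2 for c in cols):
        nxt = [[] for _ in range(len(cols) + 1)]
        for k, c in enumerate(cols):
            i = 0
            while len(c) - i >= 3:
                x, y, z = c[i:i + 3]
                i += 3
                nxt[k].append(x ^ y ^ z)                       # sum bit, same weight
                nxt[k + 1].append((x & y) | (x & z) | (y & z))  # carry, next weight
            nxt[k].extend(c[i:])                               # leftovers pass through
        cols = nxt
    # Final carry-propagate addition of the remaining (at most two) rows.
    return sum(sum(c) << k for k, c in enumerate(cols))


print(cc_multiply(13, 11, 4))
```

In a pipelined hardware version, registers would be placed between the stages that the `while` loop models here; the number of loop iterations corresponds to the number of adder stages.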
VLSI algorithms for compressed pattern search using tree based codes
Proceedings The International Conference on Application Specific Array Processors Pub Date : 1995-07-24 DOI: 10.1109/ASAP.1995.522915
A. Mukherjee, T. Acharya
Abstract: Data compression methods are used to reduce the redundancy in data representation in order to decrease the data storage requirements and communication costs. In order to exploit the benefits of data compression to conserve internal processor storage and computation resources, it is desirable to perform operations on compressed data without decompressing it. We present hardware algorithms and VLSI implementation of a chip to search a compressed text with respect to keys or patterns in compressed form using Huffman-type tree-based codes.
Citations: 5
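The entry above describes searching compressed text for a compressed pattern without decompressing either. A minimal software sketch of that idea, assuming a fixed prefix (Huffman-type) code, follows: the search compares bit strings directly but only at codeword boundaries, so that a bit-level coincidence in the middle of a codeword is never reported as a match. The code table `CODE` and the functions `encode` and `search_compressed` are hypothetical and are not the paper's hardware algorithm.

```python
# Hypothetical prefix code for a four-symbol alphabet.
CODE = {"a": "0", "b": "10", "c": "110", "d": "111"}


def encode(text, code=CODE):
    """Concatenate the codewords of each symbol into one bit string."""
    return "".join(code[ch] for ch in text)


def search_compressed(ctext, cpattern, code=CODE):
    """Return symbol positions where cpattern occurs in ctext."""
    codewords = set(code.values())
    hits, pos, symbol = [], 0, 0
    while pos < len(ctext):
        # Compare compressed bits directly, but only at a codeword boundary.
        if ctext.startswith(cpattern, pos):
            hits.append(symbol)
        # Skip exactly one codeword; a prefix code makes its end unambiguous.
        end = pos + 1
        while ctext[pos:end] not in codewords:
            if end >= len(ctext):
                raise ValueError("compressed text ends inside a codeword")
            end += 1
        pos, symbol = end, symbol + 1
    return hits


print(search_compressed(encode("abacabad"), encode("aba")))
```

Skipping a codeword still requires walking the code tree, but no symbols are reconstructed or stored, which is the saving the abstract refers to.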
Input buffering requirements of a systolic array for the inverse discrete wavelet transform
Proceedings The International Conference on Application Specific Array Processors Pub Date : 1995-07-24 DOI: 10.1109/ASAP.1995.522920
R. Lang, A. Spray
Abstract: The Discrete Wavelet Transform (DWT) is a signal processing technique popularised by its results in data compression. Considerable work has been done in designing novel architectures to perform the DWT, including a systolic architecture designed by the authors, but little attention has been given to the inverse DWT, which is needed in applications such as data compression for signal reconstruction. Despite the fact that the inverse DWT is computationally the reverse of the DWT, the hardware design for the architecture is not simply mirrored. Existing designs expect the architecture for the inverse DWT to be a simple follow-on step from the DWT design; however, this is not the case. We present one such problem here, showing the FIFO buffering required on the input of the inverse architecture. We show how the size of this buffer can be calculated and compare it to a fixed data array implementation. This work is based on our systolic array design and is an integral part of the inverse DWT design we are working on for image and video compression.
Citations: 4