Proceedings of 3rd International Conference on Algorithms and Architectures for Parallel Processing: latest papers

Special purpose neurocomputers: an automatic design approach
A. Basaglia, W. Fornaciari, F. Salice
{"title":"Special purpose neurocomputers: an automatic design approach","authors":"A. Basaglia, W. Fornaciari, F. Salice","doi":"10.1109/ICAPP.1997.651532","DOIUrl":"https://doi.org/10.1109/ICAPP.1997.651532","url":null,"abstract":"A methodology to design a digital special purpose neurocomputer implementing feedforward multilayer neural networks is presented. The design flow consists of three stages: the weight discretization, which relaxes the precision requirements maintaining the compatibility with the original model; the architectural synthesis, which transforms the abstract description into an optimized digital structure; and the VHDL model generation, which produces the VHDL description of the special purpose neurocomputer by using a set of parametric components.","PeriodicalId":325978,"journal":{"name":"Proceedings of 3rd International Conference on Algorithms and Architectures for Parallel Processing","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124962232","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Update based distributed shared memory integrated into RHODOS' memory management
J. Silcock, A. Gościński
{"title":"Update based distributed shared memory integrated into RHODOS' memory management","authors":"J. Silcock, A. Gościński","doi":"10.1109/ICAPP.1997.651494","DOIUrl":"https://doi.org/10.1109/ICAPP.1997.651494","url":null,"abstract":"The DSM system we propose in this paper is implemented completely at the operating system level as a component of RHODOS' Memory (Space) Manager. In addition, it is integrated with RHODOS' existing invalidation-based DSM allowing the programmers to choose the consistency protocol best suited to their application. These factors enable RHODOS DSM to provide the user with a transparent, efficient and scalable shared memory programming environment. In this paper, we describe the logical design, implementation and performance study of an update based DSM which strictly adheres to the above criteria. These criteria allow the user to program using a familiar model while taking advantage of the greater scalability of COWs.","PeriodicalId":325978,"journal":{"name":"Proceedings of 3rd International Conference on Algorithms and Architectures for Parallel Processing","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129855328","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 14
Modeling and evaluation of a new cluster-based system for commercial applications
W. Hahn, Suk-Han Yoon, Kangwoo Lee, M. Dubois
{"title":"Modeling and evaluation of a new cluster-based system for commercial applications","authors":"W. Hahn, Suk-Han Yoon, Kangwoo Lee, M. Dubois","doi":"10.1109/ICAPP.1997.651489","DOIUrl":"https://doi.org/10.1109/ICAPP.1997.651489","url":null,"abstract":"We model and evaluate a new parallel processing system for commercial applications, called SPAX. SPAX cost-effectively overcomes the SMP limitation by providing both the scalability of a parallel processing system and the application portability of an SMP. To investigate whether the new architecture satisfies the requirements of commercial applications, such as OLTP, we have built system and workload models. The results of the simulation show that the IO subsystem becomes the bottleneck before the newly developed system network. We find that SPAX can still meet the IO requirement of the OLTP workload as its network and IO node support the flexible IO subsystem, in terms of the number of disk drives and IO nodes versus that of processing nodes.","PeriodicalId":325978,"journal":{"name":"Proceedings of 3rd International Conference on Algorithms and Architectures for Parallel Processing","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127589587","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Subtorii allocation strategies for torus connected networks
S. Gupta, P. Srimani
{"title":"Subtorii allocation strategies for torus connected networks","authors":"S. Gupta, P. Srimani","doi":"10.1109/ICAPP.1997.651498","DOIUrl":"https://doi.org/10.1109/ICAPP.1997.651498","url":null,"abstract":"In this paper we investigate the problem of how to schedule n independent jobs on an m × m torus based network. We develop a model to quantify the effect of contention for communication links on the dilation of job execution time when multiple jobs share communication links.","PeriodicalId":325978,"journal":{"name":"Proceedings of 3rd International Conference on Algorithms and Architectures for Parallel Processing","volume":"90 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131320364","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Generating efficient parallel code for successive over-relaxation
P. Tang
{"title":"Generating efficient parallel code for successive over-relaxation","authors":"P. Tang","doi":"10.1109/ICAPP.1997.651517","DOIUrl":"https://doi.org/10.1109/ICAPP.1997.651517","url":null,"abstract":"A complete suite of algorithms for parallelizing compilers to generate efficient SPMD code for SOR problems is presented. By applying unimodular transformation before loop tiling and parallelization, the number of messages per iteration per processor is reduced from 3^n − 1 in the conventional parallel SOR algorithm to 2^n − 1, where n is the dimensionality of the data set. To maintain the memory-scalability, a novel approach to use the local dynamic memory of parallel processors to implement the skewed data set is proposed.","PeriodicalId":325978,"journal":{"name":"Proceedings of 3rd International Conference on Algorithms and Architectures for Parallel Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129754401","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
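The message counts quoted in the abstract above (3^n − 1 versus 2^n − 1) can be made concrete with a small sketch. It assumes, as an interpretation that the abstract does not spell out, that each message goes to one neighbouring block of an n-dimensional block decomposition: offsets drawn from {-1, 0, 1} in the conventional case and from {0, 1} once dependences point one way. The function names are hypothetical and this is not the paper's algorithm.

```python
from itertools import product

def neighbor_offsets(n, steps):
    """Return every non-zero offset vector of length n with entries from `steps`."""
    return [v for v in product(steps, repeat=n) if any(v)]

def message_counts(n):
    """Count neighbours under the two assumed offset sets for dimension n."""
    full = len(neighbor_offsets(n, (-1, 0, 1)))   # all surrounding blocks: 3**n - 1
    one_sided = len(neighbor_offsets(n, (0, 1)))  # one direction per axis: 2**n - 1
    return full, one_sided

for n in (1, 2, 3):
    full, one_sided = message_counts(n)
    print(f"n={n}: {full} vs {one_sided} messages")
```

Under this reading the saving grows quickly with dimensionality: 8 versus 3 neighbours in two dimensions, 26 versus 7 in three.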
ATME: a parallel programming environment for applications with conditional task attributes
Lin Huang, M. Oudshoorn
{"title":"ATME: a parallel programming environment for applications with conditional task attributes","authors":"Lin Huang, M. Oudshoorn","doi":"10.1109/ICAPP.1997.651497","DOIUrl":"https://doi.org/10.1109/ICAPP.1997.651497","url":null,"abstract":"Parallel applications with inconstant usage patterns present a big challenge to programmers in that the spawning of tasks and the communication between them may be conditional (named \"conditional parallel programming\"). Ideally, the programmer should not be burdened by operational issues which have little relationship to the application itself. This paper proposes a new parallel programming environment, ATME, to automate task scheduling in conditional parallel programming. By adaptively producing accurate estimates of the task model prior to execution, ATME modifies task distribution to improve the system and application performance.","PeriodicalId":325978,"journal":{"name":"Proceedings of 3rd International Conference on Algorithms and Architectures for Parallel Processing","volume":"68 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127703962","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Parallel neural network training on Multi-Spert
P. Farber, K. Asanović
{"title":"Parallel neural network training on Multi-Spert","authors":"P. Farber, K. Asanović","doi":"10.1109/ICAPP.1997.651531","DOIUrl":"https://doi.org/10.1109/ICAPP.1997.651531","url":null,"abstract":"Multi-Spert is a scalable parallel system built from multiple Spert-II nodes which we have constructed to speed error backpropagation neural network training for speech recognition research. We present the Multi-Spert hardware and software architecture, and describe our implementation of two alternative parallelization strategies for the backprop algorithm. We have developed detailed analytic models of the two strategies which allow us to predict performance over a range of network and machine parameters. The models' predictions are validated by measurements for a prototype five node Multi-Spert system. This prototype achieves a neural network training performance of over 530 million connection updates per second (MCUPS) while training a realistic speech application neural network. The model predicts that performance will scale to over 800 MCUPS for eight nodes.","PeriodicalId":325978,"journal":{"name":"Proceedings of 3rd International Conference on Algorithms and Architectures for Parallel Processing","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115834605","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 35
Parallel implementation of synthetic aperture radar on high performance computing platforms
Jinwoo Suh, M. Ung, Viktor K. Prasanna
{"title":"Parallel implementation of synthetic aperture radar on high performance computing platforms","authors":"Jinwoo Suh, M. Ung, Viktor K. Prasanna","doi":"10.1109/ICAPP.1997.651522","DOIUrl":"https://doi.org/10.1109/ICAPP.1997.651522","url":null,"abstract":"We show a high throughput implementation of SAR on high performance computing (HPC) platforms. In our implementation, the processors are divided into two groups of size M and N. The first group consisting of M processors computes the FDC (frequency domain convolution) in range dimension, and the second group of N processors computes the FDC in azimuth dimension. M and N are determined by the computational requirements of FDC in range and azimuth dimensions respectively. The key contribution of this paper is the development of a general high-throughput M-to-N communication algorithm. The M-to-N communication algorithm is a basic communication primitive used in many signal processing applications when a software task pipeline is employed to obtain high throughput performance. Our algorithm reduces the number of communication steps to lg(N/M+1) + n(k−1), where k ≥ 2 and n = [lg_k M]. Implementation results on the IBM SP2 and the Cray T3D based on the MITRE real-time benchmarks are presented. The results show that, given an image of size 1K × 1K, the minimum number of processors required for processing the SAR benchmarks can be reduced by 50% by using the proposed communication algorithm.","PeriodicalId":325978,"journal":{"name":"Proceedings of 3rd International Conference on Algorithms and Architectures for Parallel Processing","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123755019","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 13
An efficient local address generation for the block-cyclic distribution
Oh-Young Kwon, Tae-Geun Kim, T. Han, Sung-Bong Yang, Shin-Dug Kim
{"title":"An efficient local address generation for the block-cyclic distribution","authors":"Oh-Young Kwon, Tae-Geun Kim, T. Han, Sung-Bong Yang, Shin-Dug Kim","doi":"10.1109/ICAPP.1997.651507","DOIUrl":"https://doi.org/10.1109/ICAPP.1997.651507","url":null,"abstract":"In order to generate local addresses for an array section A(l:h:s) with block-cyclic distribution, an efficient compiling method is required. In this paper, two local address generation methods for the block-cyclic distribution are presented. One is a simple local address generation method that is modified from the virtual-block scheme. The other is a linear-time ΔM table construction method. The array elements of A(l:h:s) to be accessed at run-time build up a family of lines. By using the equation of the lines, a ΔM table can be generated in O(k) time. Experimental results show that a simple local address generation method has poor performance but a linear-time ΔM table generation method is faster than other algorithms in ΔM table generation time and access time for 10,000 array elements.","PeriodicalId":325978,"journal":{"name":"Proceedings of 3rd International Conference on Algorithms and Architectures for Parallel Processing","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125662850","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
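To illustrate what local address generation for a block-cyclic distribution involves, here is a brute-force sketch. It is deliberately not the linear-time method from the abstract above. It assumes zero-based global indices, an inclusive upper bound h as in Fortran section notation, block size k, p processors, and the usual mapping in which global block b goes to processor b mod p. The function names, and the reading of the ΔM table as the gaps between successive local addresses, are assumptions made for illustration.

```python
def local_addresses(l, h, s, k, p, proc):
    """Enumerate (global index, local address) pairs of section A(l:h:s) owned by `proc`.

    Brute force over every section element, so the cost grows with the
    section length rather than with the block size k.
    """
    pairs = []
    for i in range(l, h + 1, s):
        block = i // k                        # global block containing element i
        if block % p == proc:                 # blocks are dealt out round-robin
            local = (block // p) * k + i % k  # local block base plus offset in block
            pairs.append((i, local))
    return pairs

def delta_m(pairs):
    """Gaps between successive local addresses (assumed meaning of a ΔM table)."""
    addrs = [local for _, local in pairs]
    return [b - a for a, b in zip(addrs, addrs[1:])]

pairs = local_addresses(l=0, h=40, s=3, k=4, p=3, proc=0)
print(pairs)
print(delta_m(pairs))
```

For these parameters the gaps repeat with a short period, which is the regularity that table-driven schemes exploit instead of testing every element of the section.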
Adaptive routing for a bus-based multiprocessor
V. Fazio
{"title":"Adaptive routing for a bus-based multiprocessor","authors":"V. Fazio","doi":"10.1109/ICAPP.1997.651478","DOIUrl":"https://doi.org/10.1109/ICAPP.1997.651478","url":null,"abstract":"This paper describes and compares an implementation of an unusual hot-spot-resistant adaptive routing architecture. This paper evaluates the performance of the architecture.","PeriodicalId":325978,"journal":{"name":"Proceedings of 3rd International Conference on Algorithms and Architectures for Parallel Processing","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126580912","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0