Proceedings the First Aizu International Symposium on Parallel Algorithms/Architecture Synthesis: latest papers

Parallel algorithms for extracting ridges and ravines
R. Huang, T. Kunii
{"title":"Parallel algorithms for extracting ridges and ravines","authors":"R. Huang, T. Kunii","doi":"10.1109/AISPAS.1995.401362","DOIUrl":"https://doi.org/10.1109/AISPAS.1995.401362","url":null,"abstract":"This paper proposes two parallel algorithms, an even region parallel algorithm (ERPA) and an even strip parallel algorithm (ESPA), for extracting ridge and ravine geometric features of a surface. The parallel programs were implemented on a GCcl-1/64 T805 transputer-based parallel machine with a maximum of 64 transputers. The performance of the two algorithms is reported and analyzed with respect to load balance and communication overheads. The efficiency and speed-up versus the number of transputers used and the problem size chosen are shown and discussed.","PeriodicalId":321580,"journal":{"name":"Proceedings the First Aizu International Symposium on Parallel Algorithms/Architecture Synthesis","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122080774","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Program transformations and skeletons: formal derivation of parallel programs
A. Geerling
{"title":"Program transformations and skeletons: formal derivation of parallel programs","authors":"A. Geerling","doi":"10.1109/AISPAS.1995.401332","DOIUrl":"https://doi.org/10.1109/AISPAS.1995.401332","url":null,"abstract":"The paper describes, from a software engineering perspective, a framework for the formal development of parallel algorithms on arbitrary architectures. The algorithms are synthesised in a transformational way, i.e. by applying correctness-preserving rewrite rules to a formal specification. The architectures are modelled by skeletons: higher-order functions that represent elementary computations on a certain architecture. It is shown that the combination of transformational programming and skeletons stimulates the reuse of program derivations. Furthermore, interskeleton transformations will provide the means for architecture-independent program development.","PeriodicalId":321580,"journal":{"name":"Proceedings the First Aizu International Symposium on Parallel Algorithms/Architecture Synthesis","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123906259","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 10
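The skeleton idea in the abstract above can be illustrated with a minimal Python sketch. It is not taken from the paper: the names `map_seq`, `map_par`, and `compose` are hypothetical, and a thread pool merely stands in for a target architecture. The sketch shows one skeleton (map) with two interchangeable implementations, plus one correctness-preserving rewrite (map fusion) applied to sample data.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, TypeVar

A = TypeVar("A")
B = TypeVar("B")
C = TypeVar("C")


def map_seq(f: Callable[[A], B], xs: Iterable[A]) -> List[B]:
    """Sequential implementation of the map skeleton."""
    return [f(x) for x in xs]


def map_par(f: Callable[[A], B], xs: Iterable[A], workers: int = 4) -> List[B]:
    """Parallel implementation of the same skeleton (thread pool as a stand-in)."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map preserves input order, so results match map_seq.
        return list(pool.map(f, xs))


def compose(f: Callable[[B], C], g: Callable[[A], B]) -> Callable[[A], C]:
    """Function composition: compose(f, g)(x) == f(g(x))."""
    return lambda x: f(g(x))


def double(x: int) -> int:
    return 2 * x


def increment(x: int) -> int:
    return x + 1


data = list(range(10))

# Rewrite rule (map fusion): map f . map g == map (f . g).
# Left-hand side: two sequential passes over the data.
two_passes = map_seq(double, map_seq(increment, data))
# Right-hand side: one fused pass, run on the parallel implementation.
one_pass = map_par(compose(double, increment), data)
```

Because both implementations share one signature, a derivation written against the skeleton can be reused unchanged when the implementation underneath is swapped.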
On the effect of spare positioning on the reconfigurability of two-dimensional processor arrays
V. Obac Roda, T. Lin
{"title":"On the effect of spare positioning on the reconfigurability of two-dimensional processor arrays","authors":"V. Obac Roda, T. Lin","doi":"10.1109/AISPAS.1995.401343","DOIUrl":"https://doi.org/10.1109/AISPAS.1995.401343","url":null,"abstract":"We investigated some reconfiguration and routing aspects of fault-tolerant processing arrays. An interconnection topology with disjoint buses for the horizontal and vertical connections, called \"double bus array\", was adopted. Reconfiguration of the array after diagnosis encompasses the allocation of spare units to replace the faulty processors, renaming of the processor elements, and interconnecting (routing) data through the operating processors according to the initially specified operation. We fully simulated reconfiguration and routing for arrays of size N from 5 to 25 processors, with 1 to 2N+1 faults. Faults were generated randomly to simulate defects on a wafer. We present the results of the simulations and discuss the possible reasons for reliability improvements.","PeriodicalId":321580,"journal":{"name":"Proceedings the First Aizu International Symposium on Parallel Algorithms/Architecture Synthesis","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116057455","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Improvement of duplication scheduling heuristic algorithm with nonstrict triggering of program graph nodes
B. Benko, M. Ojsteršek, V. Zumer
{"title":"Improvement of duplication scheduling heuristic algorithm with nonstrict triggering of program graph nodes","authors":"B. Benko, M. Ojsteršek, V. Zumer","doi":"10.1109/AISPAS.1995.401321","DOIUrl":"https://doi.org/10.1109/AISPAS.1995.401321","url":null,"abstract":"The problem of multiprocessor scheduling can be stated as finding a schedule for a general task graph to be executed on a multiprocessor system so that the schedule length is minimised. This scheduling problem is known to be NP-hard, and heuristic algorithms have been proposed to obtain optimal and suboptimal solutions. The duplication scheduling heuristic algorithm solves the max-min problem of parallel processor scheduling by duplicating selected scheduled tasks on some PEs. The max-min problem is caused by the trade-off between maximum parallelism and minimum communication delay. This paper introduces an extension of the near-optimal scheduling heuristic, based on a duplication scheduling heuristic. We have focused our research efforts on three main extensions of the original heuristic.","PeriodicalId":321580,"journal":{"name":"Proceedings the First Aizu International Symposium on Parallel Algorithms/Architecture Synthesis","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115512750","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
A scalable performance analysis tool for PowerPC based MPP systems
O. Hansen, J. Krammer
{"title":"A scalable performance analysis tool for PowerPC based MPP systems","authors":"O. Hansen, J. Krammer","doi":"10.1109/AISPAS.1995.401352","DOIUrl":"https://doi.org/10.1109/AISPAS.1995.401352","url":null,"abstract":"This paper introduces a tool for optimizing programs on massively parallel computing systems. The tool has been implemented for a PowerPC-based parallel computing platform. It is scalable with respect to its implementation and the way it presents performance data. A major feature contributing to the scalable representation of performance data is the ability to focus measurements on points of interest in the program execution by specifying behavioral attributes. Behavioral attributes are given as thresholds on the results of other measurements. Thus a direct link between the results of different measurements can be made, which enables the user to link global system behavior to the execution of individual program parts.","PeriodicalId":321580,"journal":{"name":"Proceedings the First Aizu International Symposium on Parallel Algorithms/Architecture Synthesis","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114265673","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Parallelizing a PDE solver: experiences with PISCES-MP
B. Herndon, A. Raefsky, R. Dutton
{"title":"Parallelizing a PDE solver: experiences with PISCES-MP","authors":"B. Herndon, A. Raefsky, R. Dutton","doi":"10.1109/AISPAS.1995.401327","DOIUrl":"https://doi.org/10.1109/AISPAS.1995.401327","url":null,"abstract":"The paper presents a methodology for adapting dusty deck PDE solvers for parallel execution. Our approach minimizes changes to existing code and data structures, thereby preserving the value captured within dusty decks. This scheme uses the single program multiple data programming paradigm on message-passing distributed-memory architectures. To demonstrate the viability of our methodology, the commercially available dusty deck semiconductor device modeling program PISCES has been adapted for parallel execution. Simulating realistic complex device structures, we have achieved excellent performance gains over high-performance serial workstations. Also, the scalability of the parallel simulator allows the simulation of structures too large for our existing serial computers.","PeriodicalId":321580,"journal":{"name":"Proceedings the First Aizu International Symposium on Parallel Algorithms/Architecture Synthesis","volume":"84 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115560750","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Hamiltonicity, vertex symmetry, and broadcasting of uni-directional hypercubes
S. Chern, Tai-Ching Tuan, J. Jwo
{"title":"Hamiltonicity, vertex symmetry, and broadcasting of uni-directional hypercubes","authors":"S. Chern, Tai-Ching Tuan, J. Jwo","doi":"10.1109/AISPAS.1995.401339","DOIUrl":"https://doi.org/10.1109/AISPAS.1995.401339","url":null,"abstract":"We show that the two uni-directional n-cubes, namely UHC1_n and UHC2_n, proposed by Chou and Du (1990) as interconnection schemes, are Hamiltonian. In addition, we show that (1) if n is even, both architectures are vertex symmetric; and (2) if n is odd, both architectures have exactly two vertex-symmetric components. By studying symmetry, we further prove that the maximum delay of one-port one-to-all broadcasting for either architecture is at most 1.5n.","PeriodicalId":321580,"journal":{"name":"Proceedings the First Aizu International Symposium on Parallel Algorithms/Architecture Synthesis","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131694331","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Parallel polygon rendering on the graphics computer VC-1
T. Kunii, S. Nishimura
{"title":"Parallel polygon rendering on the graphics computer VC-1","authors":"T. Kunii, S. Nishimura","doi":"10.1109/AISPAS.1995.401361","DOIUrl":"https://doi.org/10.1109/AISPAS.1995.401361","url":null,"abstract":"This paper describes a parallel polygon rendering method on the graphics computer VC-1. The architecture of the VC-1 is a loosely coupled array of general-purpose processors, each of which is equipped with a local frame buffer. The contents of the local frame buffers are merged into one in real time, with visibility control based on screen depth. In our polygon rendering method, polygons are distributed among the processors and each processor independently computes the image of its assigned polygons using the Z-buffer method. To achieve load balancing, a technique called adaptive parallel rasterization is developed. Adaptive parallel rasterization automatically selects the appropriate parallelizing approach according to the estimated size of polygons displayed on the screen. The measured rendering performance of the VC-1 using this polygon rendering method is shown.","PeriodicalId":321580,"journal":{"name":"Proceedings the First Aizu International Symposium on Parallel Algorithms/Architecture Synthesis","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130754650","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
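The depth-based merge of local frame buffers described in the abstract above can be sketched in a few lines of Python. This is an illustrative software stand-in, not the VC-1's real-time merging mechanism: the `Pixel` representation, the function name `merge_frame_buffers`, and the convention that a smaller depth is nearer the viewer are all assumptions made for the example.

```python
from typing import List, Optional, Tuple

# A pixel is (depth, color); None marks a pixel no polygon covered.
Pixel = Optional[Tuple[float, str]]
FrameBuffer = List[List[Pixel]]


def merge_frame_buffers(buffers: List[FrameBuffer]) -> FrameBuffer:
    """Merge local frame buffers, keeping the nearest (smallest-depth) pixel."""
    height = len(buffers[0])
    width = len(buffers[0][0])
    merged: FrameBuffer = [[None] * width for _ in range(height)]
    for buf in buffers:
        for y in range(height):
            for x in range(width):
                candidate = buf[y][x]
                if candidate is None:
                    continue
                current = merged[y][x]
                # Smaller depth means closer to the viewer in this sketch.
                if current is None or candidate[0] < current[0]:
                    merged[y][x] = candidate
    return merged


# Two hypothetical processors, each holding a 1x3 local frame buffer
# produced by running a Z-buffer pass over its own polygons.
local_a: FrameBuffer = [[(5.0, "red"), None, (2.0, "red")]]
local_b: FrameBuffer = [[(3.0, "blue"), (4.0, "blue"), (9.0, "blue")]]

image = merge_frame_buffers([local_a, local_b])
```

Since the per-pixel minimum is associative and commutative, the merge result does not depend on which processor rendered which polygon, which is what lets each processor work independently.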
Cohesion: an efficient distributed shared memory system supporting multiple memory consistency models
C. Shieh, An-Chow Lai, Jyh-Chang Ueng, Tyng-Yeu Liang, Tzu-Chiang Chang, Su-Cheong Mac
{"title":"Cohesion: an efficient distributed shared memory system supporting multiple memory consistency models","authors":"C. Shieh, An-Chow Lai, Jyh-Chang Ueng, Tyng-Yeu Liang, Tzu-Chiang Chang, Su-Cheong Mac","doi":"10.1109/AISPAS.1995.401322","DOIUrl":"https://doi.org/10.1109/AISPAS.1995.401322","url":null,"abstract":"This paper describes a DSM prototype called Cohesion, which supports two memory consistency models, namely sequential consistency and release consistency, within a single program to improve performance and to support a wide variety of parallel programs. Memory that is sequentially consistent is further divided into object-based and conventional (page-based) memory, which are constructed at user level and kernel level, respectively. In object-based memory, the shared data are kept consistent at the granularity of an object; it is provided to improve the performance of fine-grained parallel applications that may incur a significant overhead in conventional or release memory, as well as to eliminate unnecessary movement of pages that are protected in a critical section. On the other hand, the release consistency model is supported in Cohesion to attack the problem of excessive network traffic and false sharing. Cohesion programs are written in C++, and the annotation of shared objects for release and object-based memory is accomplished by inheriting a system-provided base class. Finally, three application programs, Matrix Multiplication, SOR, and Nbody, have been employed to evaluate the efficiency of Cohesion. In addition, a Producer-Consumer program is tested to show that object-based memory is beneficial in a critical section.","PeriodicalId":321580,"journal":{"name":"Proceedings the First Aizu International Symposium on Parallel Algorithms/Architecture Synthesis","volume":"513 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123433492","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 17
Fault tolerant routing in toroidal networks
Q. Gu, S. Peng
{"title":"Fault tolerant routing in toroidal networks","authors":"Q. Gu, S. Peng","doi":"10.1109/AISPAS.1995.401342","DOIUrl":"https://doi.org/10.1109/AISPAS.1995.401342","url":null,"abstract":"We give an O(r^2) time algorithm for constructing a fault-free routing path of optimal length between any two non-faulty nodes of an r-dimensional torus with 2r-1 faulty nodes. We show that the Rabin diameter of an r-dimensional torus is its diameter plus one. We also describe a cluster fault tolerant (CFT) routing model and give an efficient algorithm for node-to-node CFT routing.","PeriodicalId":321580,"journal":{"name":"Proceedings the First Aizu International Symposium on Parallel Algorithms/Architecture Synthesis","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133195075","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 13
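To make the routing problem in the abstract above concrete, here is a brute-force Python baseline. It is explicitly not the paper's O(r^2) algorithm: it uses plain breadth-first search, and it assumes a torus in which every one of the r dimensions has the same side length `k` (the abstract does not state the side lengths). The function names and the sample fault set are hypothetical.

```python
from collections import deque
from typing import Dict, List, Optional, Set, Tuple

Node = Tuple[int, ...]


def torus_distance(u: Node, v: Node, k: int) -> int:
    """Fault-free distance: per-dimension wrap-around distances, summed."""
    return sum(min((a - b) % k, (b - a) % k) for a, b in zip(u, v))


def neighbors(u: Node, k: int) -> List[Node]:
    """The 2r neighbors of u (all distinct when k >= 3)."""
    result = []
    for i, coord in enumerate(u):
        for step in (1, -1):
            result.append(u[:i] + ((coord + step) % k,) + u[i + 1:])
    return result


def fault_free_path(
    src: Node, dst: Node, k: int, faulty: Set[Node]
) -> Optional[List[Node]]:
    """Shortest path from src to dst that avoids faulty nodes, found by BFS."""
    parent: Dict[Node, Optional[Node]] = {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            # Walk the parent links back to src, then reverse.
            path: List[Node] = []
            node: Optional[Node] = u
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for v in neighbors(u, k):
            if v not in faulty and v not in parent:
                parent[v] = u
                queue.append(v)
    return None  # dst is unreachable from src


# r = 2, k = 5, with 2r - 1 = 3 faulty nodes blocking the direct route.
faulty_nodes: Set[Node] = {(1, 0), (1, 1), (1, 4)}
path = fault_free_path((0, 0), (2, 0), 5, faulty_nodes)
```

In this example the fault-free distance is 2, while the shortest fault-free path wraps around the other way and has length 3. That stays within the bound of diameter plus one (here 4 + 1 = 5) that the abstract states for the Rabin diameter.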