{"title":"Massively Parallel Processing Project as a Priority Area of Research for the Ministry of Education","authors":"Hidehiko Tanaka","doi":"10.1109/ISPAN.1994.367141","DOIUrl":"https://doi.org/10.1109/ISPAN.1994.367141","url":null,"abstract":"The Massively Parallel Processing Project started in 1992 as a Priority Area of Research for the Ministry of Education in Japan. The objective of this research project is to establish the basic technology of massively parallel processing, which is expected to be the fundamental tool for developing the high-level technologies of the 21st century. The main goal of this project is to build a prototype of a massively parallel processing system. This paper describes the organization of this project and discusses the research results to date.","PeriodicalId":142405,"journal":{"name":"Proceedings of the International Symposium on Parallel Architectures, Algorithms and Networks (ISPAN)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126413143","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A logic semantics for a class of nondeterministic concurrent constraint logic programs","authors":"Ho-fung Leung, Bo-Ming Tong","doi":"10.1109/ISPAN.1994.367181","DOIUrl":"https://doi.org/10.1109/ISPAN.1994.367181","url":null,"abstract":"M.J. Maher (1987) proposed the ALPS class of committed-choice languages, which can be seen as a further development of concurrent logic programming languages in the direction of CLP(X). However, due to the lack of OR-nondeterminism, ALPS is a class of declarative algorithmic programming languages. In this paper, we present the FENG class of concurrent constraint logic programming languages and give its soundness and completeness results. With the novel feature of constraint-based nondeterminism, FENG enriches the semantics of ALPS and CLP(X). One of the features of FENG is that it supports constraint-based nondeterminism. For some classes of programs, this improves the efficiency of program execution. FENG reveals a direction for data-parallel implementations of constraint logic programs. This has been confirmed by our experience in the design and implementation of Firebird, a restriction of FENG, on the massively parallel machine DECmpp.","PeriodicalId":142405,"journal":{"name":"Proceedings of the International Symposium on Parallel Architectures, Algorithms and Networks (ISPAN)","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127062659","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Parallel relational database algorithms revisited for range declustered data sets","authors":"E. Schikuta","doi":"10.1109/ISPAN.1994.367168","DOIUrl":"https://doi.org/10.1109/ISPAN.1994.367168","url":null,"abstract":"Currently available parallel database systems use conventional parallel hardware architectures employing a highly parallel software architecture. An emerging technique speeds up execution by declustering the stored data sets among a number of parallel and independent disk drives. In this paper we revisit parallel relational database algorithms for range declustering. We adapt the conventional, well-studied parallel algorithms for declustered data, exploit the inherent order property of the partitioned data sets, and analytically compare the performance of the algorithms. It is shown that the parallel range declustered variants generally outperform their conventional parallel counterparts.","PeriodicalId":142405,"journal":{"name":"Proceedings of the International Symposium on Parallel Architectures, Algorithms and Networks (ISPAN)","volume":"69 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132012477","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Transit: reliable high speed interconnection technology","authors":"T. Knight","doi":"10.1109/ISPAN.1994.367180","DOIUrl":"https://doi.org/10.1109/ISPAN.1994.367180","url":null,"abstract":"Important improvements in network bandwidth, latency, and fault tolerance can be provided by careful selection of the protocols, choice of network topology, details of interconnection wiring, and basic wire driving technologies. We examine the improvements in some of these areas as part of the Transit project at the MIT Artificial Intelligence Laboratory.","PeriodicalId":142405,"journal":{"name":"Proceedings of the International Symposium on Parallel Architectures, Algorithms and Networks (ISPAN)","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128336023","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cenju-3 parallel computer and its application to CFD","authors":"K. Muramatsu, S. Doi, T. Washio, T. Nakata","doi":"10.1109/ISPAN.1994.367184","DOIUrl":"https://doi.org/10.1109/ISPAN.1994.367184","url":null,"abstract":"Exploiting parallel computers effectively for computational fluid dynamics (CFD) requires an efficient parallel algorithm for the numerical solution of the Navier-Stokes equations. This paper discusses the parallelization of an incompressible Navier-Stokes equation solver and its implementation on the Cenju-3 parallel computer. We mainly focus on the parallel solution of the discrete Poisson equations, whose solution consumes most of the computational time. Two parallel linear solvers are implemented based on a Rectangular Domain Decomposition Method (RDDM) and on the SMAC scheme. One parallel linear solver is the Blocked MICCG (B-MICCG) method introduced by Washio and Hayami. The other linear solver is the Multi-Grid preconditioned Bi-CGSTAB (MG-Bi-CGSTAB) method. Numerical experiments have been conducted on Cenju-3 for up to 25 processors. Although the MG-Bi-CGSTAB method produces poorer speedup than the B-MICCG method, the MG-Bi-CGSTAB method is about 4 times faster than the B-MICCG method in terms of the total execution time.","PeriodicalId":142405,"journal":{"name":"Proceedings of the International Symposium on Parallel Architectures, Algorithms and Networks (ISPAN)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128781137","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A parallel algorithm for the single step searching problem","authors":"J. S. Lin, F. Hsu, Richard C. T. Lee","doi":"10.1109/ISPAN.1994.367138","DOIUrl":"https://doi.org/10.1109/ISPAN.1994.367138","url":null,"abstract":"Introduces the single-step searching problem, which is defined as follows. We are given a graph where each vertex is associated with a weight. Assume that every edge of the graph is of equal length. A fugitive may be hidden in any edge. We are asked to assign searchers to vertices to search the entire graph in one step such that no fugitive can escape. The cost of a searching plan is related to the weights of the vertices in which the searchers are initially located. Our goal is to minimize the cost of the searching plan. A parallel algorithm based upon the EREW model is proposed to solve this problem. This algorithm applies the tree contraction technique. The critical point is that we have to transform a general tree into a binary tree, including pseudo-nodes, in order to apply this tree contraction technique. A new algorithm is devised to solve the problem on the transformed binary tree. It can be proved that this new algorithm is correct, as it produces a correct solution for the original tree. Our algorithm has an optimal speed-up.","PeriodicalId":142405,"journal":{"name":"Proceedings of the International Symposium on Parallel Architectures, Algorithms and Networks (ISPAN)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125971275","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Recursive circulant: a new topology for multicomputer networks (extended abstract)","authors":"Jung-Heum Park, Kyung-Yong Chwa","doi":"10.1109/ISPAN.1994.367162","DOIUrl":"https://doi.org/10.1109/ISPAN.1994.367162","url":null,"abstract":"We propose a new topology for multicomputer networks, called recursive circulant. Recursive circulant G(N, d) is defined to be a circulant graph with N nodes and jumps of powers of d, d≥2. G(N, d) is node symmetric, has a hamiltonian cycle unless N≤2, and can be recursively constructed when N=cd^m, 1≤c≤d. We analyze various network metrics of G(cd^m, d) such as connectivity, diameter, mean internode distance, visit ratio, and develop a shortest path routing algorithm in G(cd^m, d). G(2^m, 4), whose degree is m, compares favorably to the hypercube Q_m. G(2^m, 4) has the maximum connectivity, and its diameter is [(3m-1)/4]. A simple broadcasting algorithm in G(2^m, 4) is also presented.","PeriodicalId":142405,"journal":{"name":"Proceedings of the International Symposium on Parallel Architectures, Algorithms and Networks (ISPAN)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129203794","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A shortest path algorithm for banded matrices by a mesh connection without processor penalty","authors":"Aohan Mei, Y. Igarashi","doi":"10.1109/ISPAN.1994.367151","DOIUrl":"https://doi.org/10.1109/ISPAN.1994.367151","url":null,"abstract":"We give an efficient shortest path algorithm on a mesh-connected processor array for n×n banded matrices with bandwidth b. We use a [b/2]×[b/2] semisystolic processor array. The input data is supplied to the processor array from the host computer. The output from the processor array can also be supplied to itself through the host computer. This algorithm computes all-pairs shortest distances within the band in 7n-4[b/2]-1 steps.","PeriodicalId":142405,"journal":{"name":"Proceedings of the International Symposium on Parallel Architectures, Algorithms and Networks (ISPAN)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127782116","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Processing nested loop structure with data-flow dependence on a CAM-based processor HAPP","authors":"K. Lu, K. Tamaru","doi":"10.1109/ISPAN.1994.367156","DOIUrl":"https://doi.org/10.1109/ISPAN.1994.367156","url":null,"abstract":"A significant advantage of content addressable memory (CAM) is that operations are performed locally, which can eliminate the bottleneck between processor and memory. In this paper, we propose a CAM-based associative processing processor (HAPP) that can be combined with a general processor to form an array-processor system. Besides retrieval operations, it can assist the general processor in manipulating nested loop structures with data-flow dependence to achieve high speedup in this system. We enumerate some problems of applying HAPP to a computer system to deal with nested loop structures, and the methods we used to resolve them. We also compare HAPP with a parallel machine, the BBN TC2000, to show that HAPP incurs a smaller communication penalty when the number of data item accesses of the BBN TC2000 surpasses the penalty plane.","PeriodicalId":142405,"journal":{"name":"Proceedings of the International Symposium on Parallel Architectures, Algorithms and Networks (ISPAN)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128023588","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Building a better butterfly: the multiplexed metabutterfly","authors":"F. Chong, E. Brewer, F. Leighton, T. Knight","doi":"10.1109/ISPAN.1994.367163","DOIUrl":"https://doi.org/10.1109/ISPAN.1994.367163","url":null,"abstract":"Multistage networks are important in a wide variety of applications. Expander-based networks, such as multibutterflies, are a tremendous improvement over traditional butterflies in both fault and congestion tolerance. However, multibutterflies cost at least twice as much in chips and wiring as butterflies. It is also impossible to build large multibutterflies due to their wiring complexity. We show that we can build an expander-based network that has comparable cost to a butterfly with the same number of endpoints, yet has substantially better fault and congestion performance. Specifically, we introduce a hierarchical construction that dramatically reduces wiring complexity and makes large expanders buildable. We are able to exploit the hierarchical structure to find large numbers of logical wires to multiplex over a smaller number of physical wires. Since many of the wires in an expander-based network are used to provide alternate paths, not useful bandwidth, substantial multiplexing can be done without significantly degrading performance. We present simulation results to support our conclusions. In comparing a butterfly with the comparable 2-to-1 multiplexed metabutterfly, we found that the metabutterfly performed better by nearly a factor of two on random traffic and by more than a factor of five on worst-case traffic.","PeriodicalId":142405,"journal":{"name":"Proceedings of the International Symposium on Parallel Architectures, Algorithms and Networks (ISPAN)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115682449","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}