{"title":"Divacon: a parallel language for scientific computing based on divide-and-conquer","authors":"Z. G. Mou","doi":"10.1109/FMPC.1990.89496","DOIUrl":"https://doi.org/10.1109/FMPC.1990.89496","url":null,"abstract":"An overview of the language, covering Divacon primitives and simple programming constructs that are referred to as functional forms, is given. Two divide-and-conquer programming constructs are discussed. Divacon style programming is demonstrated for a number of scientific applications. Some interesting equivalences and transformations between Divacon programs are examined. Implementation and performance are briefly considered.<<ETX>>","PeriodicalId":193332,"journal":{"name":"[1990 Proceedings] The Third Symposium on the Frontiers of Massively Parallel Computation","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1990-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129678682","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A distributed backpropagation algorithm of neural networks on distributed-memory multiprocessors","authors":"H. Yoon, J.H. Nang, S. Maeng","doi":"10.1109/FMPC.1990.89482","DOIUrl":"https://doi.org/10.1109/FMPC.1990.89482","url":null,"abstract":"A distributed backpropagation algorithm for a fully connected multilayered neural network on a distributed-memory multiprocessor system is presented. The neurons on each layer are partitioned into p disjoint sets, and each set is mapped on a processor of a p-processor system. The algorithm, the communication pattern among the processors, and their time/space complexities are investigated, and the theoretical upper bound on speedup is obtained. The experimental speedup obtained with the algorithm on a ring of 32 transputers, which confirms the model and analysis, is reported. It is found that the choice of processor interconnection topology does not influence the speedup ratio.<<ETX>>","PeriodicalId":193332,"journal":{"name":"[1990 Proceedings] The Third Symposium on the Frontiers of Massively Parallel Computation","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1990-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121879167","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deterministic PRAM simulation with constant memory blow-up and no time-stamps","authors":"Y. Aumann, A. Schuster","doi":"10.1109/FMPC.1990.89431","DOIUrl":"https://doi.org/10.1109/FMPC.1990.89431","url":null,"abstract":"A scheme for deterministic simulation of a parallel random-access machine (PRAM) on a module parallel computer or on bounded-degree networks is described. The scheme requires only a constant memory blowup, thus achieving better memory utilization than previously known approaches. The method does not need time stamps, which were a basic element of all previous schemes. The improvements are achieved by adopting error-correcting-code techniques. Several coding methods are considered, and tradeoffs between memory utilization, run time, and the size of the PRAM shared memory are derived.<<ETX>>","PeriodicalId":193332,"journal":{"name":"[1990 Proceedings] The Third Symposium on the Frontiers of Massively Parallel Computation","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1990-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126148467","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Parallel optimization of stack filters","authors":"K. Rao, K. Efe, C. H. Chu","doi":"10.1109/FMPC.1990.89505","DOIUrl":"https://doi.org/10.1109/FMPC.1990.89505","url":null,"abstract":"An open problem associated with designing stack filters is finding the optimum configuration for a given noise type and the signal characteristics which need to be preserved. This problem is modeled here as a combinatorial search problem. Efficient search methods that can be easily implemented on any massively parallel computer were developed and tested in two parallel computing environments. The first is the Connection Machine, and the second is a hypercube-connected MIMD (multiple-instruction-stream, multiple-data-stream) machine simulated using Cosmic C. The performance of the filters found by the developed algorithms was excellent in comparison with the performance of the median filter. The efficiency of the algorithms clearly demonstrates the potential of using them for adaptive filtering. The algorithms can be implemented on any type of parallel computer.<<ETX>>","PeriodicalId":193332,"journal":{"name":"[1990 Proceedings] The Third Symposium on the Frontiers of Massively Parallel Computation","volume":"90 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1990-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129453117","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Generalized supercube: an incrementally expandable interconnection network","authors":"Arunabha Sen, A. Sengupta, S. Bandyopadhyay","doi":"10.1109/FMPC.1990.89488","DOIUrl":"https://doi.org/10.1109/FMPC.1990.89488","url":null,"abstract":"A class of incrementally expandable interconnection networks with high connectivity and low diameter is introduced for massively parallel and distributed processing. This class of networks can be constructed for any number of computing nodes, and the network size can easily be incremented without a major reconfiguration of the network. The connectivity and the diameter of the network are on the order of the logarithm of the number of nodes. It is shown that the connectivity of the network is equal to the minimum node degree. In this sense the connectivity is optimal. The routing algorithms for the network are simple to implement.<<ETX>>","PeriodicalId":193332,"journal":{"name":"[1990 Proceedings] The Third Symposium on the Frontiers of Massively Parallel Computation","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1990-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121524972","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Topological properties of banyan-hypercube networks","authors":"A. Youssef, B. Narahari","doi":"10.1109/FMPC.1990.89478","DOIUrl":"https://doi.org/10.1109/FMPC.1990.89478","url":null,"abstract":"Topological properties of banyan-hypercubes are discussed, and a family of generalized banyan-hypercubes is defined. A banyan-hypercube, denoted BH(h, k, s), is constructed by taking the bottom h levels of a rectangular banyan of spread s and s/sup k/ nodes per level for s a power of two, and interconnecting the nodes at each level in a hypercube. BHs can be viewed as a scheme for interconnecting hypercubes while keeping most of the advantages of the latter. The definition of BHs is extended and generalized to allow the interconnection of an unlimited number of hypercubes and to allow any h successive levels of the banyan to interconnect hypercubes. This leads to better extendibility and flexibility in partitioning the BH. The diameter and average distance of the generalized BH are derived and are shown to provide an improvement over the hypercube for a wide range of h, k, and s values. Self-routing point-to-point and broadcasting algorithms are presented, and efficient embeddings of various networks on the BH are shown.<<ETX>>","PeriodicalId":193332,"journal":{"name":"[1990 Proceedings] The Third Symposium on the Frontiers of Massively Parallel Computation","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1990-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114358324","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Parallel relational operations based on clustered surrogate files","authors":"S. M. Chung","doi":"10.1109/FMPC.1990.89463","DOIUrl":"https://doi.org/10.1109/FMPC.1990.89463","url":null,"abstract":"In the context of very large databases, the main problem is how to process relational operations in the minimum amount of time to satisfy user queries. To speed up the relational operations on very large databases, parallel processing is essential. A reasonable indexing scheme for parallel processing systems is the concatenated code word (CCW) surrogate file, which is small in size and requires simple maintenance. Since interrelated relational operations can be performed on the CCW surrogate files, considerable processing time can be saved by performing interrelated relational operations on CCW files before the large data files are accessed. CCW surrogate files can be satisfactorily mapped into parallel architectures because their structure is quite compact and regular. To speed up the relational operations based on CCW surrogate files, it is possible to cluster the CCW surrogate files. If a CCW surrogate file is clustered, only a subset of the surrogate file is searched for a relational operation. Clustered-CCW surrogate file and data file structures suitable for a parallel processing system are introduced. Parallel relational operation algorithms based on the clustered file structures are developed and evaluated.<<ETX>>","PeriodicalId":193332,"journal":{"name":"[1990 Proceedings] The Third Symposium on the Frontiers of Massively Parallel Computation","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1990-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115341374","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Toward scalable algorithms for orthogonal shared-memory parallel computers","authors":"I. Scherson, A. Mehra, J. Rexford","doi":"10.1109/FMPC.1990.89430","DOIUrl":"https://doi.org/10.1109/FMPC.1990.89430","url":null,"abstract":"The problem of developing scalable and near-optimal algorithms for orthogonal shared-memory multiprocessing systems with a multidimensional access (MDA) memory array is considered. An orthogonal shared-memory system consists of 2/sup n/ processors and 2/sup m/ memory modules accessed in any one of m possible access modes. Data stored in memory modules are available to processors under a mapping rule that allows conflict-free data reads and writes for any given access mode. Scalable algorithms are presented for two well-known computational problems, namely, matrix multiplication and the fast Fourier transform (FFT). A complete analysis of the algorithms based on computational time and the access modes needed is also presented. The algorithms scale very well onto higher dimensional MDA architectures but are not always optimal. This reveals a tradeoff between the scalability of an algorithm and its optimality in the MDA computational model.<<ETX>>","PeriodicalId":193332,"journal":{"name":"[1990 Proceedings] The Third Symposium on the Frontiers of Massively Parallel Computation","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1990-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123400285","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PRA*: a memory-limited heuristic search procedure for the Connection Machine","authors":"M. Evett, James A. Hendler, A. Mahanti, Dana S. Nau","doi":"10.1109/FMPC.1990.89450","DOIUrl":"https://doi.org/10.1109/FMPC.1990.89450","url":null,"abstract":"A variant of A* search designed to run on the massively parallel SIMD (single-instruction-stream, multiple-data-stream) Connection Machine is described. The algorithm is designed to run in a limited memory; a retraction technique allows nodes with poor heuristic values to be removed from the open list until such time as they may need reexpansion if more promising paths fail. The algorithm, called PRA* (for parallel retraction A*), takes maximum advantage of the SIMD design of the Connection Machine and is guaranteed to return an optimal path when an admissible heuristic is used. Results comparing PRA* to R. Korf's IDA* (see Artif. Intell. J., vol.27, 1985) for the 15 puzzle show significantly fewer node expansions for PRA*.<<ETX>>","PeriodicalId":193332,"journal":{"name":"[1990 Proceedings] The Third Symposium on the Frontiers of Massively Parallel Computation","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1990-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128736284","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The GPA machine: a generally partitionable MSIMD architecture","authors":"T. Bridges","doi":"10.1109/FMPC.1990.89460","DOIUrl":"https://doi.org/10.1109/FMPC.1990.89460","url":null,"abstract":"The GPA machine, a massively parallel, multiple single-instruction-stream-multiple-data-stream (MSIMD) system, is described. Its distinguishing characteristic is the generality of its partitioning capabilities. Like the PASM system, it can be dynamically reconfigured to operate as one or more independent SIMD machines. However, unlike PASM, the only constraint placed on partitioning is that an individual processing element is a member of at most one partition. This capability allows for reconfiguration based on the run-time status of dynamic data structures and for partitioning of disconnected and overlapping data structures. Significant speedups are expected from operating on data structures in place; copying of data to a newly configured partition is unnecessary. The GPA system consists of N processing-element/RAM pairs and an interconnection network providing access to and from P control processors or microcontrollers. With current technologies, values for N and P of 64K and 16, respectively, are feasible.<<ETX>>","PeriodicalId":193332,"journal":{"name":"[1990 Proceedings] The Third Symposium on the Frontiers of Massively Parallel Computation","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1990-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125532124","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}