{"title":"Solving the algebraic Riccati equation on a hypercube multiprocessor","authors":"J. Gardiner, A. Laub","doi":"10.1145/63047.63116","DOIUrl":"https://doi.org/10.1145/63047.63116","url":null,"abstract":"A parallel algorithm for solving the algebraic Riccati equation is described and its performance on an Intel iPSC/d5 is reported. Three variations of the matrix sign function algorithm are compared. The best one showed efficiencies of about 60 percent on large problems.","PeriodicalId":299435,"journal":{"name":"Conference on Hypercube Concurrent Computers and Applications","volume":"144 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1989-01-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115906475","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dynamical simulations of granular materials using the Caltech hypercube","authors":"B. Werner, P. Haff","doi":"10.1145/63047.63085","DOIUrl":"https://doi.org/10.1145/63047.63085","url":null,"abstract":"A technique for simulating the motion of granular materials using the Caltech Hypercube is described. We demonstrate that grain dynamics simulations run efficiently on the Hypercube and therefore that they offer an opportunity for greatly expanding the use of parallel simulations in studying granular materials. Several examples, which illustrate how the simulations can be used to extract information concerning the behavior of granular materials, are discussed.","PeriodicalId":299435,"journal":{"name":"Conference on Hypercube Concurrent Computers and Applications","volume":"71 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1989-01-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115919672","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An iterative solution to special linear systems on a vector hypercube","authors":"L. G. Pillis, J. Petersen, J. Pillis","doi":"10.1145/63047.63130","DOIUrl":"https://doi.org/10.1145/63047.63130","url":null,"abstract":"An Intel Hypercube implementation of a new stationary iterative method developed by one of us (JdP) is presented. This algorithm finds the solution vector x for the invertible n × n linear system Ax = (I - B)x = f where A has real spectrum. The solution method converges quickly because the Jacobi iteration matrix B is replaced by an equivalent iteration matrix with a smaller spectral radius. The parallel algorithm partitions A row-wise among all the processors in order to keep memory load to a minimum and to avoid duplicate computations. With the introduction of vector hardware to the Hypercube, more modifications have been made to the implementation algorithm in order to exploit that hardware and reduce run-time even further. Example problems and timings will be presented.","PeriodicalId":299435,"journal":{"name":"Conference on Hypercube Concurrent Computers and Applications","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1989-01-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117043142","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DIME: a programming environment for unstructured triangular meshes on a distributed-memory parallel processor","authors":"R. D. Williams","doi":"10.1145/63047.63136","DOIUrl":"https://doi.org/10.1145/63047.63136","url":null,"abstract":"DIME (Distributed Irregular Mesh Environment) is a user environment written in C for manipulation of an unstructured triangular mesh in two dimensions. The mesh is distributed among the separate memories of the processors, and communication between processors is handled by DIME; thus the user writes C-code referring to the elements and nodes of the mesh and need not be unduly concerned with the parallelism. A tool is provided for the user to make an initial coarse triangulation of a region, which may then be adaptively refined and load-balanced. DIME provides many graphics facilities for examining the mesh, including contouring and a Postscript hard-copy interface. DIME also runs on sequential machines.","PeriodicalId":299435,"journal":{"name":"Conference on Hypercube Concurrent Computers and Applications","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1989-01-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114932286","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Non-local path integral Monte Carlo on the hypercube","authors":"D. Callahan","doi":"10.1145/63047.63083","DOIUrl":"https://doi.org/10.1145/63047.63083","url":null,"abstract":"Some interesting physical properties of solid 3He are described and the physical observable calculated using Monte Carlo path integral techniques is defined. The relationship between the path integral and the observable is outlined. The parallel algorithm is explained and finally, timing results are presented for runs of the identical code on one parallel computer and two sequential computers: the NCUBE hypercube, the Cray XMP, the Elxsi 6400.","PeriodicalId":299435,"journal":{"name":"Conference on Hypercube Concurrent Computers and Applications","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1989-01-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127017387","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Parallel expert system search techniques for a real-time application","authors":"G. Lamont, D. Shakley","doi":"10.1145/63047.63090","DOIUrl":"https://doi.org/10.1145/63047.63090","url":null,"abstract":"Expert systems are being used to govern the intelligent control of the Robotic Air Vehicle (RAV) which is currently a research project at the Air Force Avionics Laboratory. Due to the nature of the RAV system the associated expert system needs to perform in a demanding real-time environment. The use of a parallel processing capability to support the associated computational requirement may be critical in this application. Thus, parallel search algorithms for real-time expert systems are designed, analyzed and synthesized on the Texas Instruments (TI) Explorer and Intel Hypercube. Examined is the process involved with transporting the RAV expert systems from the TI Explorer, where they are implemented in the Automated Reasoning Tool (ART), to the iPSC Hypercube, where the system is synthesized using Concurrent Common LISP. The performance characteristics of the parallel implementation of these expert systems on the iPSC Hypercube are compared to the TI Explorer implementation.","PeriodicalId":299435,"journal":{"name":"Conference on Hypercube Concurrent Computers and Applications","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1989-01-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125838806","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Parallel vision techniques on the hypercube computer","authors":"A. H. Bond, D. Fashena","doi":"10.1145/63047.63054","DOIUrl":"https://doi.org/10.1145/63047.63054","url":null,"abstract":"Parallel algorithms for programming low-level vision mechanisms on the JPL-Caltech hypercube are reported. These concern principally edge and region finding. 256x256 8-bit images were used.\nWe discuss the problem of programming a hypercube computer, and the Caltech approach to load balancing. We then discuss the distribution of images over the hypercube and the I/O problem for images.\nIn edge finding, we programmed convolution using a separable kernel computational approach. This was tested with 5x5 and 32x32 masks.\nIn region finding, we developed two different parallel histogram techniques. The first finds a global histogram for the image by a completely parallel technique. This method, which was developed from the Fox-Furmanski scalar product method, allows each histogram bucket to be computed by a separate processor, each processor regarding the hypercube as a different tree, and all buckets being computed in parallel by a complete interleaving of all communications required. Similarly, the global histogram can then be distributed over the hypercube, so that all processors have the entire global histogram, by a completely parallel technique.\nThe second histogramming method finds a spatially local histogram within each processor and then connects locally found regions together.\nWork in progress includes the application of a Hopfield neural net approach to region finding.","PeriodicalId":299435,"journal":{"name":"Conference on Hypercube Concurrent Computers and Applications","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1989-01-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124946261","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"What have we learnt from using real parallel machines to solve real problems?","authors":"Geoffrey C. Fox","doi":"10.1145/63047.63048","DOIUrl":"https://doi.org/10.1145/63047.63048","url":null,"abstract":"We briefly review some key scientific and parallel processing issues in a selection of some 84 existing applications of parallel machines. We include the MIMD hypercube, transputer array, BBN Butterfly, and the SIMD ICL DAP, Goodyear MPP and Connection Machine from Thinking Machines. We use a space-time analogy to classify problems and show how a division into synchronous, loosely synchronous and asynchronous problems is helpful. This classifies problems into those suitable for SIMD or MIMD machines and isolates the asynchronous class as that for which major uncertainties as to possible parallelism exist. Interestingly, about half of the scientific applications run excellently on SIMD machines with the other half able to take especial advantage of the MIMD architecture.","PeriodicalId":299435,"journal":{"name":"Conference on Hypercube Concurrent Computers and Applications","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1989-01-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121895808","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"FCP: a summary of performance results","authors":"Stephen Taylor, R. Shapiro, E. Shapiro","doi":"10.1145/63047.63092","DOIUrl":"https://doi.org/10.1145/63047.63092","url":null,"abstract":"Flat Concurrent Prolog is a simple concurrent programming language which has been used for a variety of non-trivial applications. A compiler based parallel implementation has been completed which operates on an Intel Hypercube. This paper presents a brief summary of performance data from a recent study of the implementation. Three categories of program were studied: parallel applications, uniprocessor benchmarks and communication stereotypes. The latter programs are abstractions of common parallel programming techniques and serve to quantify the cost of communication in the language.","PeriodicalId":299435,"journal":{"name":"Conference on Hypercube Concurrent Computers and Applications","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1989-01-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128232742","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cholesky downdating on a hypercube","authors":"C. S. Henkel, M. Heath, R. Plemmons","doi":"10.1145/63047.63120","DOIUrl":"https://doi.org/10.1145/63047.63120","url":null,"abstract":"Least squares modifications associated with the addition or deletion of data often involve updating or downdating the Cholesky factor of the observation matrix. We describe and compare parallel implementations for the hypercube of three methods for downdating the Cholesky factor: an orthogonal scheme, a hyperbolic scheme, and a hybrid scheme combining the first two. The computational complexities of these algorithms differ significantly, but the parallel implementations of all three have communication complexity similar to solving triangular systems. In computational tests on an Intel iPSC hypercube, the algorithms performed similarly, suggesting a preference for the orthogonal method based on stability considerations. The methods we describe can be adapted to the parallel computation of general orthogonal factorizations, but our discussion is motivated by applications in signal processing using windowed recursive least squares filtering for near real-time solutions.","PeriodicalId":299435,"journal":{"name":"Conference on Hypercube Concurrent Computers and Applications","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1989-01-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128395851","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}