{"title":"Performance Comparison of Two Algorithms for Task Assignment","authors":"Anup Kumar, S. Ramakrishnan, Chinar Deshpande, L. Dunning","doi":"10.1109/ICPP.1994.159","DOIUrl":"https://doi.org/10.1109/ICPP.1994.159","url":null,"abstract":"In this article we investigate a new algorithm for solving the optimal task assignment problem. The assignment is based on Stone's \"throughput\" metric, with the optimality criterion being the total cost of execution and interprocess communication. The two approaches studied are a new approach based on genetic algorithms and A*, a well-known tree search algorithm, applied to the same problem. We use algorithm execution time as the performance criterion for the two algorithms. It is shown that the genetic algorithm techniques are more favorable than A* for larger search spaces, while A* is preferred for smaller search spaces.","PeriodicalId":162043,"journal":{"name":"1994 International Conference on Parallel Processing Vol. 3","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116263193","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving the Performance of Global Communication on a 3D Torus Network","authors":"Yasushi Kawakura, S. Oyanagi","doi":"10.1109/ICPP.1994.118","DOIUrl":"https://doi.org/10.1109/ICPP.1994.118","url":null,"abstract":"The authors developed a one-to-all broadcasting algorithm that is less dependent on the increase in the number of processors, by reducing the overhead of each communication and by shortening the message transmission sequence. The results show that a program that employs this algorithm can achieve a 2.8 times speedup in the case of a 32K processor system.","PeriodicalId":162043,"journal":{"name":"1994 International Conference on Parallel Processing Vol. 3","volume":"61 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126302022","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Parallel Algorithm for an Inverse Problem Associated with a Hyperbolic System of Partial Differential Equations","authors":"P. Nelson, Mike Phillips","doi":"10.1109/ICPP.1994.40","DOIUrl":"https://doi.org/10.1109/ICPP.1994.40","url":null,"abstract":"Parallel computing was applied to the solution of an inverse problem arising from a hyperbolic system of two coupled linear first-order partial differential equations. A known sequential algorithm based on the method of invariant imbedding was parallelized by a mapping assigning processors to characteristics. Two implementations of an algorithm based on this mapping and suitable for a message-passing architecture, but differing in their relative demands on communication and local memory, are described. Timing estimates for these are developed, and various consequences of these, notably the corresponding estimates for efficiency, are compared with computational results obtained from implementations on a 64-node hypercube system.","PeriodicalId":162043,"journal":{"name":"1994 International Conference on Parallel Processing Vol. 3","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126740872","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Concurrent Multi Target Tracker: Benchmarking and Portability","authors":"S. Hariri, Rajesh Yadav, B. Thiagarajan, Sung-Yong Park, Mahesh Subramanyan, Rajashekar Reddy, G. Fox","doi":"10.1109/ICPP.1994.20","DOIUrl":"https://doi.org/10.1109/ICPP.1994.20","url":null,"abstract":"With the current advances in computing and network technology and software, the gap between parallel and distributed computing environments is gradually becoming narrower. Consequently, parallel programs run on parallel as well as distributed systems. However, programming and porting complex applications to such environments is a challenging task and not well understood. In this paper, we use a concurrent multi target tracker as a running example to analyze and evaluate the performance of two different parallel implementations on parallel and distributed systems. We have benchmarked both of these implementations on different architectures that vary from a network of workstations (SUN, IBM RS6000) to parallel computers (CM5, iPSC 860) using different parallel/distributed message passing tools (PVM, p4, EXPRESS).","PeriodicalId":162043,"journal":{"name":"1994 International Conference on Parallel Processing Vol. 3","volume":"37 33","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120865520","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adaptive Algorithm-Based Fault Tolerance for Parallel Computing in Linear Systems","authors":"J. Khan, Woei Lin, D. Yun","doi":"10.1109/ICPP.1994.49","DOIUrl":"https://doi.org/10.1109/ICPP.1994.49","url":null,"abstract":"This paper presents a dynamically adaptive stabilization scheme for parallel matrix computation. The scheme performs automatic error detection and correction by inserting redundant but concurrent tracer computations within the folds of the regular computation. It also eliminates the costly row interchange used in classical pivoting. A fault-tolerant double wavefront matrix algorithm for a MIMD array multi-processor with toroidal interconnection has been designed to demonstrate the strength of the proposed scheme. This algorithm can compute: i) the matrix inverse, ii) the solution vector to the linear system, and iii) a predetermined linear combination of the solution vector, all from an identical algorithmic framework. This efficient tri-solution algorithm excels most other known methods in parallel performance.","PeriodicalId":162043,"journal":{"name":"1994 International Conference on Parallel Processing Vol. 3","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131187491","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Parallel Recursive Computations where Both Recombination and Partition Overheads are Problem-Dependent","authors":"A. Saha, M. D. Wagh","doi":"10.1109/ICPP.1994.152","DOIUrl":"https://doi.org/10.1109/ICPP.1994.152","url":null,"abstract":"Parallel recursive computations incorporating the unavoidable and significant parallel computing overheads, encompassing a wide variety of applications, can be modeled as $T(n) = \\begin{cases} t_0, & \\text{for } n \\le n_0 \\\\ \\min_{0 \\le r \\le n} \\left\\{ \\max\\left\\{ T(n-r),\\, T(r) + k(r) \\right\\} + \\lambda(n,r) \\right\\}, & \\text{otherwise} \\end{cases}$ where $k(r)$ and $\\lambda(n,r)$ represent the partition and recombination overheads respectively. The optimal partition size (solution to r of the above minmax recurrence relation) is nontrivial and is very different from the n/2 value conventionally used. Using the optimal partitions at every stage of the recursion enhances the performance greatly. In this paper we solve a challenging case of our parallel recursive model where the overhead functions are problem-dependent.","PeriodicalId":162043,"journal":{"name":"1994 International Conference on Parallel Processing Vol. 3","volume":"494 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125004315","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Digital Circuit Testing on a Network of Workstations","authors":"S. Srinivasan, J. Aylor","doi":"10.1109/ICPP.1994.89","DOIUrl":"https://doi.org/10.1109/ICPP.1994.89","url":null,"abstract":"This paper presents preliminary results of research being conducted in the application of distributed/parallel algorithms to automatic test pattern generation for synchronous sequential circuits. The system is required to be portable across a wide variety of architectures - from workstations to dedicated parallel machines. This requirement necessitates the use of a parallel programming environment that provides a layer of abstraction between the application and the hardware. The first phase of the proposed system involves generation of circuit covers - input patterns that control the circuit outputs.","PeriodicalId":162043,"journal":{"name":"1994 International Conference on Parallel Processing Vol. 3","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128503718","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Distributed Approach for Implementing Genetic Algorithms","authors":"A. Srivastava, Anup Kumar, R. M. Pathak","doi":"10.1109/ICPP.1994.92","DOIUrl":"https://doi.org/10.1109/ICPP.1994.92","url":null,"abstract":"Genetic Algorithms are search techniques for global optimization in a complex search space. One of the interesting features of Genetic Algorithms is that they lend themselves very well to parallel and distributed processing. This feature of Genetic Algorithms is useful in improving their computational efficiency for complex optimization problems. In this paper, we have implemented a Genetic Algorithm in a distributed environment such that its implementation is problem independent. This key attribute of the distributed implementation allows it to be used for different types of optimization problems. Fault tolerance and user transparency are two other important features of our distributed Genetic Algorithm implementation. The effectiveness and generality of Genetic Algorithms have been demonstrated by solving two problems: network topology design and file allocation.","PeriodicalId":162043,"journal":{"name":"1994 International Conference on Parallel Processing Vol. 3","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126813962","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On Solving Block Toeplitz Systems Using a Block Schur Algorithm","authors":"K. Gallivan, S. Thirumalai, P. Dooren","doi":"10.1109/ICPP.1994.136","DOIUrl":"https://doi.org/10.1109/ICPP.1994.136","url":null,"abstract":"This paper presents a block Schur algorithm to obtain a factorization of a symmetric block Toeplitz matrix. We develop a version based on block hyperbolic Householder reflectors by adapting the representation schemes for block Householder reflectors to the hyperbolic case. If a singular principal submatrix is encountered during the factorization, the matrix is perturbed and an approximate factorization is obtained. This is then combined with iterative refinement to obtain the final solution.","PeriodicalId":162043,"journal":{"name":"1994 International Conference on Parallel Processing Vol. 3","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132855793","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient CRCW PRAM Emulation on Practical Networks","authors":"M. Hamdi","doi":"10.1109/ICPP.1994.100","DOIUrl":"https://doi.org/10.1109/ICPP.1994.100","url":null,"abstract":"A new interconnection network is proposed for the construction of massively parallel computers. The systematic construction of this network, denoted RCN-FULL, is performed by methodically connecting together a number of basic atoms where a basic atom is a set of fully connected nodes. Key communication characteristics and efficient routing algorithms are derived for RCN-FULL. An O(log(N)) sorting algorithm is shown for RCN-FULL and RCN-FULL is proven to deterministically emulate the CRCW PRAM model, with only O(log(N)) degradation in time performance. Finally, the hardware cost for the RCN-FULL is estimated as a function of its pin limitations and compared favorably to that of the hypercube.","PeriodicalId":162043,"journal":{"name":"1994 International Conference on Parallel Processing Vol. 3","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1994-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133581091","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}