{"title":"Parallel algorithms for fast computation of normalized edit distances","authors":"Ö. Eğecioğlu, Maximilian Ibel","doi":"10.1109/SPDP.1996.570374","DOIUrl":"https://doi.org/10.1109/SPDP.1996.570374","url":null,"abstract":"The authors give work-optimal and polylogarithmic time parallel algorithms for solving the normalized edit distance problem. The normalized edit distance between two strings X and Y with lengths n ≥ m is the minimum quotient of the sum of the costs of edit operations transforming X into Y by the length of the edit path corresponding to those edit operations. Marzal and Vidal (1993) proposed a sequential algorithm with a time complexity of O(nm^2). The authors show that this algorithm can be parallelized work-optimally on an array of n (or m) processors, and on a mesh of n × m processors. They then propose a sublinear time algorithm that is almost work-optimal: using O(mn^1.75) processors, the time complexity of the algorithm is O(n^0.75 log n) and the total number of operations is O(mn^2.5 log n). This algorithm runs on a CREW PRAM, but is likely to work on weaker PRAM models and hypercubes with minor modifications. Finally, they present a polylogarithmic O(log^2 n) time algorithm based on matrix multiplication which runs on an O(n^6/log n) processor hypercube.","PeriodicalId":360478,"journal":{"name":"Proceedings of SPDP '96: 8th IEEE Symposium on Parallel and Distributed Processing","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123844547","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Parallelizing multidimensional index structures","authors":"K. Kanth, D. Agrawal, A. E. Abbadi, Ambuj K. Singh, Terence R. Smith","doi":"10.1109/SPDP.1996.570358","DOIUrl":"https://doi.org/10.1109/SPDP.1996.570358","url":null,"abstract":"Indexing multidimensional data is inherently complex, leading to slow query processing. This behavior becomes more pronounced with the increase in database size and/or number of dimensions. In this paper we address this issue by processing an index structure in parallel. First, we study different ways of partitioning an index structure. We then propose efficient algorithms for processing each query in parallel on the index structure. Using these strategies, we parallelized two multidimensional index structures, R* and LIB, and evaluated the performance gains for the Gazetteer and the Catalog data of the Alexandria Digital Library on the Meiko CS-2.","PeriodicalId":360478,"journal":{"name":"Proceedings of SPDP '96: 8th IEEE Symposium on Parallel and Distributed Processing","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128441463","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Comparison of two storage models in data-driven multithreaded architectures","authors":"M. Annavaram, W. Najjar","doi":"10.1109/SPDP.1996.570324","DOIUrl":"https://doi.org/10.1109/SPDP.1996.570324","url":null,"abstract":"Multithreaded execution models attempt to combine some aspects of dataflow-like execution with von Neumann model execution, with the objective of masking the latency of inter-processor communications and remote memory accesses in multiprocessors. An important issue in the analysis and evaluation of multithreaded execution is the design and performance of the storage hierarchy. Because of the sequential execution of threads, the locality of access within an executing thread can be exploited using registers and cache. At the inter-thread level, however, the locality of accesses to memory and its effect on the cache is not yet well understood. Two storage hierarchy models, that attempt to capture and exploit this locality, are described and evaluated in this paper.","PeriodicalId":360478,"journal":{"name":"Proceedings of SPDP '96: 8th IEEE Symposium on Parallel and Distributed Processing","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116916589","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Scheduling of multiprocessor tasks for numerical applications","authors":"T. Rauber, G. Rünger","doi":"10.1109/SPDP.1996.570371","DOIUrl":"https://doi.org/10.1109/SPDP.1996.570371","url":null,"abstract":"The authors investigate the efficient implementation of algorithms with a two-level parallelism on distributed memory machines. They consider parallel specifications consisting of an upper level of multiprocessor tasks, each of which has an internal structure of uni-processor tasks. To achieve an optimal parallel execution time, the parallel execution of such a program requires an optimal scheduling of the multiprocessor tasks and an appropriate treatment of uni-processor tasks. In particular they consider an important class of parallel programs that are generated within a specific parallel programming model designing group-SPMD programs for scientific computing. They show how the costs of data redistributions between M-tasks can be taken into consideration and how the special structure of the resulting program can be exploited by using a simple approximation algorithm with provably good performance.","PeriodicalId":360478,"journal":{"name":"Proceedings of SPDP '96: 8th IEEE Symposium on Parallel and Distributed Processing","volume":"79 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117212068","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The APHID parallel αβ search algorithm","authors":"M. Brockington, J. Schaeffer","doi":"10.1109/SPDP.1996.570365","DOIUrl":"https://doi.org/10.1109/SPDP.1996.570365","url":null,"abstract":"The paper introduces the APHID (Asynchronous Parallel Hierarchical Iterative Deepening) game-tree search algorithm. APHID represents a departure from the approaches used in practice. Instead of parallelism based on the minimal search tree, APHID uses a truncated game-tree and all of the leaves of that tree are searched in parallel. APHID has been programmed as an easy-to-implement, game-independent αβ library, and has been tested on several game-playing programs. Results for an Othello program are presented. The algorithm yields good parallel performance on a network of workstations, without using a shared transposition table.","PeriodicalId":360478,"journal":{"name":"Proceedings of SPDP '96: 8th IEEE Symposium on Parallel and Distributed Processing","volume":"459 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115623039","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bitwise aggregate networks","authors":"R. Hoare, H. Dietz, T. Mattox, Soohong P. Kim","doi":"10.1109/SPDP.1996.570348","DOIUrl":"https://doi.org/10.1109/SPDP.1996.570348","url":null,"abstract":"Typical communication networks for parallel processing are based on sending data from one processor to one, or all, of the other processors. Using such a network, many simple operations that require information from every processor require many point-to-point or broadcast communications. These aggregate operations can be as simple as a barrier synchronization or as complex as an arithmetic reduction. In this paper we discuss a class of networks that directly implement a wide range of aggregate operations. These networks are capable of performing aggregate operations in a single communication operation using only simple bitwise combining logic in a trivially scalable tree configuration.","PeriodicalId":360478,"journal":{"name":"Proceedings of SPDP '96: 8th IEEE Symposium on Parallel and Distributed Processing","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122747656","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Simulated annealing applied to multicomputer task allocation and processor specification","authors":"James E. Beck, D. Siewiorek","doi":"10.1109/SPDP.1996.570339","DOIUrl":"https://doi.org/10.1109/SPDP.1996.570339","url":null,"abstract":"This paper considers the design problems of processor specification and task allocation for embedded computer systems. A partitioning-based representation is proposed that allows these problems to be solved concurrently. An algorithm based on this representation is described that utilizes simulated annealing coupled with a heuristic processor specification technique. This algorithm, named SA2, is compared against three baseline algorithms on a combination of real and synthetic test cases with respect to two figures of merit: hardware cost and run-time. The real test cases are based on commercially developed automotive electronic applications and the baseline algorithms represent a mixture of heuristic approaches with varying degrees of sophistication. For all test cases, SA2 is found to generate near optimal solutions, and the relative trade-off between solution quality and run-time exhibited by the algorithms is quantified and analyzed.","PeriodicalId":360478,"journal":{"name":"Proceedings of SPDP '96: 8th IEEE Symposium on Parallel and Distributed Processing","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131817922","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A bulk-synchronous parallel library implementation for the BBN butterfly GP1000","authors":"M. Goudreau, Eric D. Root","doi":"10.1109/SPDP.1996.570346","DOIUrl":"https://doi.org/10.1109/SPDP.1996.570346","url":null,"abstract":"One of the fundamental goals of parallel computing is to develop a framework that will support portable and efficient application programs. The Bulk-Synchronous Parallel (BSP) model was proposed to help achieve this goal. The BSP model is intended to be a \"unifying model\": it addresses both software and hardware issues by allowing theoretical analysis to coexist with practical physical implementations. For several years the BSP model has been supported mainly by theoretical results. Recent experiments, however, have begun to demonstrate the practicality of the model for real architectures running real applications. The goal of this paper is to describe the methodology used to construct an efficient BSP library on the BBN Butterfly GP1000. Our results are relevant for BSP library implementations on shared-memory systems in general and for NUMA (nonuniform memory-access) machines in particular.","PeriodicalId":360478,"journal":{"name":"Proceedings of SPDP '96: 8th IEEE Symposium on Parallel and Distributed Processing","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116591389","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Generalized parallel selection in sorted matrices","authors":"Hong Shen","doi":"10.1109/SPDP.1996.570345","DOIUrl":"https://doi.org/10.1109/SPDP.1996.570345","url":null,"abstract":"This paper presents a parallel algorithm running in O(log m log* m(log log m+log(n/m))) time on an EREW PRAM with O(m/(log m log* m)) processors for the problem of selection in an m × n matrix with sorted rows and columns, m ≤ n. Our algorithm generalizes the result of Sarnath and He (1992) for selection in a sorted matrix of equal dimensions, and thus answers the open question they posed. The algorithm is work-optimal when n ≥ m log m, and near optimal within an O(log log m) factor otherwise. We show that our algorithm can be generalized to solve the selection problem on a set of sorted matrices of arbitrary dimensions.","PeriodicalId":360478,"journal":{"name":"Proceedings of SPDP '96: 8th IEEE Symposium on Parallel and Distributed Processing","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132743334","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On staggered checkpointing","authors":"N. Vaidya","doi":"10.1109/SPDP.1996.570386","DOIUrl":"https://doi.org/10.1109/SPDP.1996.570386","url":null,"abstract":"A consistent checkpointing algorithm saves a consistent view of a distributed application's state on stable storage. The traditional consistent checkpointing algorithms require different processes to save their state at about the same time. This causes contention for the stable storage, potentially resulting in large overheads. Staggering the checkpoints taken by various processes can reduce checkpoint overhead. The paper presents a simple approach to arbitrarily stagger the checkpoints. The approach requires that the processes take consistent logical checkpoints, as compared to consistent physical checkpoints enforced by existing algorithms. Experimental results on nCube-2 are presented.","PeriodicalId":360478,"journal":{"name":"Proceedings of SPDP '96: 8th IEEE Symposium on Parallel and Distributed Processing","volume":"244 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1996-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124689778","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}