{"title":"Parallel implementation of divide-and-conquer algorithms on binary de Bruijn networks","authors":"Xiaoxiong Zhong, S. Rajopadhye, V. Lo","doi":"10.1109/IPPS.1992.223064","DOIUrl":"https://doi.org/10.1109/IPPS.1992.223064","url":null,"abstract":"Studies the problem of parallel implementation of divide-and-conquer algorithms on a binary de Bruijn network using a temporal binomial tree (rather than the usual binary tree) computation structure. Two cases of message volumes are considered: (i) uniform, and (ii) logarithmically decreasing (increasing) weights. A single mapping is proposed for both cases. It has average extra dilation 1 and is communication link contention-free. A lower bound for the total extra dilation of any mapping from a uniform-weighted binomial tree to an arbitrary degree-4 network is also developed to show that the mapping is asymptotically optimal with respect to the average extra dilation. The implementation is well suited to a binary de Bruijn network with a wormhole or circuit switching communication scheme.","PeriodicalId":340070,"journal":{"name":"Proceedings Sixth International Parallel Processing Symposium","volume":"86 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115864167","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Distributed consensus in semi-synchronous systems","authors":"P. Berman, A. Bharali","doi":"10.1109/IPPS.1992.222994","DOIUrl":"https://doi.org/10.1109/IPPS.1992.222994","url":null,"abstract":"The distributed consensus problem assumes that all processors in the system have some initial values; the goal is to make all non-faulty processors agree on one of these values. This paper investigates the time needed to reach consensus in a partially synchronous model with omission failures. In this model, the processors have no direct knowledge about time, but the time between consecutive steps of each processor is always between two known constants $c_1$ and $c_2$; the ratio $C = c_2/c_1$ measures the timing uncertainty in the system. Moreover, messages are delivered within time $d$. This paper provides an improved protocol for the above problem. When the majority of the processors are fault-free, the protocol achieves consensus in time $3(\\phi+1)d+Cd$, where $\\phi$ is the actual number of faults in a specific execution of the protocol. This allows an increase in efficiency of up to 25% over the existing protocol, which requires time $4(\\phi+1)d+Cd$.","PeriodicalId":340070,"journal":{"name":"Proceedings Sixth International Parallel Processing Symposium","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126215256","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Parallel implementation of the auction algorithm on the Intel hypercube","authors":"N. Bagherzadeh, K. Hawk","doi":"10.1109/IPPS.1992.223005","DOIUrl":"https://doi.org/10.1109/IPPS.1992.223005","url":null,"abstract":"The authors present their experience in executing the auction algorithm on an iPSC/860 hypercube multiprocessor. They show the performance of the algorithm under synchronous and asynchronous computation models. In order to reduce the number of iterations for this algorithm and effectively increase the inherent parallelism in the auction algorithm, they propose and test a new technique called $\\gamma$-scaling.","PeriodicalId":340070,"journal":{"name":"Proceedings Sixth International Parallel Processing Symposium","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126385156","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fault-tolerant multiprocessor system routing using incomplete diagnostic information","authors":"D. Blough, S. Najand","doi":"10.1109/IPPS.1992.223013","DOIUrl":"https://doi.org/10.1109/IPPS.1992.223013","url":null,"abstract":"Fault-tolerant routing algorithms in multiprocessor systems utilize diagnostic information in selecting paths for messages. In many situations, only incomplete, or partial, diagnostic information is available for this purpose. The authors present algorithms for achieving two forms of diagnosis, known as k-reachability diagnosis and k-neighborhood diagnosis, which provide partial diagnostic information. They compare, both analytically and through experiments conducted on an Intel iPSC/2 hypercube, the performance and overhead of these two algorithms. They also present a routing algorithm that successfully routes messages between connected non-faulty nodes in systems of arbitrary topology containing an arbitrary number of faults. The performance of the algorithm is shown to be optimal when k=n-1 and within a factor of two of optimal, in the worst case, when k=1.","PeriodicalId":340070,"journal":{"name":"Proceedings Sixth International Parallel Processing Symposium","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125993754","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimal algorithms for the vertex updating problem of a minimum spanning tree","authors":"Donald B. Johnson, P. Metaxas","doi":"10.1109/IPPS.1992.223028","DOIUrl":"https://doi.org/10.1109/IPPS.1992.223028","url":null,"abstract":"The vertex updating problem for a minimum spanning tree (MST) is defined as follows: Given a graph $G=(V,E_G)$ and its MST $T$, update $T$ when a new vertex $z$ is introduced along with weighted edges that connect $z$ with the vertices of $G$. The authors present a set of rules that, together with a valid tree-contraction schedule, are used to produce simple optimal parallel algorithms that run in $O(\\log n)$ parallel time using $n/\\lg n$ EREW PRAMs, where $n=|V|$. These rules can also be used to derive simple linear-time sequential algorithms for the same problem. It is also shown how this solution can be used to solve the multiple vertex updating problem: Update a given MST when $k$ new vertices are introduced simultaneously. This problem is solved in $O(\\lg k \\cdot \\lg n)$ parallel time using $\\frac{k \\cdot n}{\\lg k \\cdot \\lg n}$ EREW PRAM processors.","PeriodicalId":340070,"journal":{"name":"Proceedings Sixth International Parallel Processing Symposium","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125416808","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An improved multiple-path deadlock-free routing algorithm in binary hypercubes","authors":"Qiang Li","doi":"10.1109/IPPS.1992.223000","DOIUrl":"https://doi.org/10.1109/IPPS.1992.223000","url":null,"abstract":"This paper presents a multiple-path deadlock-free routing algorithm in direct binary hypercubes which is an improved version of a previously published algorithm by the author (1991). Between two nodes of distance k, the previous algorithm provides k disjoint paths in one direction and one path in the other. The direction with one path is a performance bottleneck. The new algorithm adds one more disjoint path to the narrow direction using a buffer management technique, and preserves the deadlock-free property. Although only one path is added, simulation results presented in this paper show a significant performance improvement since the added path almost doubles the capacity of the bottleneck.","PeriodicalId":340070,"journal":{"name":"Proceedings Sixth International Parallel Processing Symposium","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133375114","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A more efficient message-optimal algorithm for distributed termination detection","authors":"T. Lai, Y. Tseng, Xuefeng Dong","doi":"10.1109/IPPS.1992.222991","DOIUrl":"https://doi.org/10.1109/IPPS.1992.222991","url":null,"abstract":"Termination detection is a fundamental problem in distributed computing. Many algorithms have been proposed, but only the S. Chandrasekaran and S. Venkatesan (CV) algorithm (1990) is known to be optimal in worst-case message complexity. This optimal algorithm, however, has several undesirable properties. First, it always requires $M'+2|E|+n-1$ control messages, whether it is worst case or best case, where $M'$ is the number of basic messages issued by the underlying computation after the algorithm starts, $|E|$ is the number of channels in the system, and $n$ is the number of processes. Second, its worst-case detection delay is $O(M')$. In a message-intensive computation, that might not be tolerable. Third, the maximum amount of space needed by each process is $O(M')$, a quantity not known at compile time, making it necessary to use the more expensive dynamic memory allocation. Last, it works only for FIFO channels. This paper remedies these drawbacks, while keeping its strength. The authors propose an algorithm that requires $M'+2(n-1)$ control messages in the worst case, but much fewer on the average, and in the best case, it uses only $2(n-1)$ control messages, no matter how large $M'$ is.","PeriodicalId":340070,"journal":{"name":"Proceedings Sixth International Parallel Processing Symposium","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133386710","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A software tool for cellular mapping of discrete unitary transforms","authors":"G. Miel, E. Yfantis","doi":"10.1109/IPPS.1992.223029","DOIUrl":"https://doi.org/10.1109/IPPS.1992.223029","url":null,"abstract":"The paper describes a software tool that facilitates mapping onto array processors of a wide class of unitary transforms. The mapping formalism of the tool depends on matrix factorizations combined with abstract constructs that link the linear concepts to a model of the array's architecture. A prototype design of the tool is graphics-based and user-driven.","PeriodicalId":340070,"journal":{"name":"Proceedings Sixth International Parallel Processing Symposium","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129222492","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Quadtree building algorithms on an SIMD hypercube","authors":"O. Ibarra, M. Kim","doi":"10.1109/IPPS.1992.223077","DOIUrl":"https://doi.org/10.1109/IPPS.1992.223077","url":null,"abstract":"Presents $O(\\log n)$ time SIMD hypercube algorithms for transforming binary images to linear quadtrees and vice versa, where $n$ is the size of the images as well as the number of hypercube nodes. The quadtree building algorithm, which generates the locational codes in preorder, is an improvement of a recently reported algorithm that runs in $O(\\log^2 n)$ time. The authors also give an optimal linear quadtree building algorithm which runs in $T(n)$ time using $n^2/T(n)$ processors for $\\log n \\le T(n) \\le n^2$. The algorithm is optimal in the sense that the product of time and number of processors is asymptotically the same as the optimal sequential time, which is $O(n^2)$. For this algorithm we assume that the input binary image is divided into blocks and loaded in a shuffled row major ordered hypercube. The algorithm uses the procedures for the quadtree building algorithm developed for the case when the number of hypercube nodes is equal to the number of pixels in the binary image.","PeriodicalId":340070,"journal":{"name":"Proceedings Sixth International Parallel Processing Symposium","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123962633","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimal aspect ratio and number of separable row/column buses for mesh-connected parallel computers","authors":"M. Serrano, B. Parhami","doi":"10.1109/IPPS.1992.223023","DOIUrl":"https://doi.org/10.1109/IPPS.1992.223023","url":null,"abstract":"A two-dimensional mesh of PEs with separable row and column buses has been shown to be quite effective for semigroup, prefix, and a wide class of other parallel computations. The authors show how semigroup and prefix computations can be performed with the same asymptotic time complexity on meshes having separable buses for a subset of rows and columns. They find that with this basic arrangement, square grids are not optimal but that a hierarchical method of synthesizing large meshes builds optimal square meshes from rectangular submeshes. The time-complexity results are shown to correspond to those previously published when certain parameters of the design are fixed at special values.","PeriodicalId":340070,"journal":{"name":"Proceedings Sixth International Parallel Processing Symposium","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115971742","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}