{"title":"A new parallel algorithm for breadth-first search on interval graphs","authors":"Sajal K. Das, Calvin Ching-Yuen Chen","doi":"10.1109/IPPS.1992.223055","DOIUrl":"https://doi.org/10.1109/IPPS.1992.223055","url":null,"abstract":"The authors design an efficient parallel algorithm for constructing a breadth-first spanning tree of an interval graph. Their novel approach is based on elegantly capturing the structure of a given collection of intervals. This structure reveals important properties of the corresponding interval graph, and is found to be instrumental in solving many other problems including the computation of a breadth-depth spanning tree, which they report for the first time. The algorithm requires O(log n) time employing O(n) processors on the EREW PRAM model.","PeriodicalId":340070,"journal":{"name":"Proceedings Sixth International Parallel Processing Symposium","volume":"21 44","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132545490","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The interplay between granularity, performance and availability in a replicated Linda tuple space","authors":"S. Kambhatla, J. Walpole","doi":"10.1109/IPPS.1992.222976","DOIUrl":"https://doi.org/10.1109/IPPS.1992.222976","url":null,"abstract":"Replication is a common method for increasing the availability of data in a distributed environment. The authors' interest is in the application of replication techniques in the domain of parallel processing. They explore the issues concerning degree of replication and granularity in the context of a distributed and highly available Linda tuple space. In particular, they study the performance effects of varying the number of replicas and the granularities of replication and concurrency control. Traditionally, when using replication in databases, the granularity of replication and that of concurrency control have been the same (at the file level (D.K. Gifford, 1979), for example). This is not an inherent requirement, however. The authors show by detailed simulation of a replicated Linda tuple space that it is useful to separate the two granularities and that it is an important design issue, especially in parallel processing systems.","PeriodicalId":340070,"journal":{"name":"Proceedings Sixth International Parallel Processing Symposium","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124624638","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A paradigm for distributed deadlock avoidance in multicomputer networks","authors":"J. P. Samantarai","doi":"10.1109/IPPS.1992.222993","DOIUrl":"https://doi.org/10.1109/IPPS.1992.222993","url":null,"abstract":"A paradigm for avoiding buffer deadlock in point-to-point multicomputer networks is presented which is ideal for today's high connectivity, load sharing networks. Unlike the traditional resource ordering principle, this paradigm not only allows unrestricted routing but uses the existence of multiple paths to its direct advantage. Deadlock is avoided entirely using exchange buffers which are not used for message queues, thus eliminating queueing overhead. The paradigm is topology-independent, imposes no routing restrictions, and uses states of neighboring links only, so that it can be built into link level protocol, providing unrestricted deadlock-free routing, while operating transparent to any fault-tolerant topology-specific routing algorithm.","PeriodicalId":340070,"journal":{"name":"Proceedings Sixth International Parallel Processing Symposium","volume":"59 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124783238","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Conflict-free scheduling of nested loop algorithms on lower dimensional processor arrays","authors":"Zhenhui Yang, Weijia Shang, J. Fortes","doi":"10.1109/IPPS.1992.223054","DOIUrl":"https://doi.org/10.1109/IPPS.1992.223054","url":null,"abstract":"In practice, it is interesting to map n-dimensional algorithms, or algorithms with n nested loops, onto (k-1)-dimensional arrays where k<n. The paper considers some open problems in a previous work by Shang and Fortes (1990). A procedure is proposed to test whether or not a given mapping has computational conflicts, and a lower bound on the total execution time is provided. Based on the testing procedure and the lower bound, the complexity and the optimality of the optimization procedure in the previous work are improved. The integer programming formulation is also discussed and used to find the optimal time mapping for the 5-dimensional bit level matrix multiplication algorithm into a 2-dimensional bit level processor array.","PeriodicalId":340070,"journal":{"name":"Proceedings Sixth International Parallel Processing Symposium","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125498940","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"IDPS: a massively parallel heuristic search algorithm","authors":"A. Mahanti, C. J. Daniels","doi":"10.1109/IPPS.1992.223042","DOIUrl":"https://doi.org/10.1109/IPPS.1992.223042","url":null,"abstract":"Presents an efficient SIMD parallel algorithm, called IDPS (iterative deepening parallel search). The performance of four variants of IDPS is studied through experiments conducted on the well known test-bed problem for search algorithms, the 15-puzzle. During the experiments, data were gathered under two different static load-balancing schemes. Under the first scheme, an average efficiency of approximately 3/4 was obtained for 4K, 8K, and 16K processors. Under the second scheme, average efficiencies of 0.92 and 0.76 were obtained for 8K and 16K processors, respectively. It is also shown that for admissible search, linear or superlinear average speedup can be obtained for problems of significant size.","PeriodicalId":340070,"journal":{"name":"Proceedings Sixth International Parallel Processing Symposium","volume":"21 5","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120891401","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploiting concurrency among tasks in partitionable parallel processing systems","authors":"W. Nation, A. A. Maciejewski, H. Siegel","doi":"10.1109/IPPS.1992.223076","DOIUrl":"https://doi.org/10.1109/IPPS.1992.223076","url":null,"abstract":"One benefit of partitionable parallel processing systems is their ability to execute multiple, independent tasks simultaneously. Previous work has identified conditions such that, when there are k tasks to be processed, partitioning the system such that all k tasks are processed simultaneously results in a minimum overall execution time. An alternate condition is developed that provides additional insight into the effects of parallelism on execution time. This result, and previous results, however, assume that execution times are data independent. It is shown that data-dependent tasks do not necessarily execute faster when processed simultaneously even if the condition is met. A model is developed that provides for the possible variability of a task's execution time and is used in a new framework to study the problem of finding an optimal mapping for identical, independent data-dependent execution time tasks onto partitionable systems. Extension of this framework to situations where the k tasks are non-identical is discussed.","PeriodicalId":340070,"journal":{"name":"Proceedings Sixth International Parallel Processing Symposium","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127074225","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Vector Hartley transform employing multiprocessors","authors":"R. Mahapatra, Akhilesh Kumar","doi":"10.1109/IPPS.1992.223038","DOIUrl":"https://doi.org/10.1109/IPPS.1992.223038","url":null,"abstract":"Many parallel implementations for signal processing transforms have already been reported. The implementation of Hou's FHT algorithm (1987) has been studied on three multiprocessor architectures (MPAs): multiprocessors connected through a shared bus; multiprocessors connected by an indirect binary n-cube multistage interconnection network; and mesh-connected multiprocessors. The article analyzes the performance of a vector Hartley transform algorithm on these MPAs.","PeriodicalId":340070,"journal":{"name":"Proceedings Sixth International Parallel Processing Symposium","volume":"76 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127292330","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-packet selection on mesh-connected processor arrays","authors":"D. Krizanc, L. Narayanan","doi":"10.1109/IPPS.1992.222999","DOIUrl":"https://doi.org/10.1109/IPPS.1992.222999","url":null,"abstract":"The authors show efficient, deterministic algorithms for selection on the mesh-connected processor array, in the case when there are several elements at every processor. In particular, on a p-processor mesh, with $N \\geq p$ elements, stored N/p at every processor, they show that selection can be performed in $O(\\min(p \\log \\frac{N}{p}, \\max(N/p^{2/3}, \\sqrt{p})))$ communication steps. The best previously known results were based on sorting and required $O(N/\\sqrt{p})$ communication steps, for $N \\geq p$.","PeriodicalId":340070,"journal":{"name":"Proceedings Sixth International Parallel Processing Symposium","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121974860","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Memory requirements to balance thus asymptotically full-speedup FFT computation on processor arrays","authors":"J. Shieh","doi":"10.1109/IPPS.1992.223045","DOIUrl":"https://doi.org/10.1109/IPPS.1992.223045","url":null,"abstract":"The paper proves that for a linearly-connected array of $\\alpha$ processors or a mesh-connected array of $\\alpha^2$ processors, where each processor has computation bandwidth C, I/O bandwidth I and $C/I = \\log m$, $\\Omega(m^{\\alpha})$ memory size is required in each processor to minimize the I/O requirement in balancing the FFT computation. Then it presents balanced FFT algorithms on these arrays to meet their memory size lower bounds. These algorithms are time optimal, exhibiting full speedups.","PeriodicalId":340070,"journal":{"name":"Proceedings Sixth International Parallel Processing Symposium","volume":"21 5","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132365036","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bussed interconnection networks from trees","authors":"C. M. Fiduccia","doi":"10.1109/IPPS.1992.223015","DOIUrl":"https://doi.org/10.1109/IPPS.1992.223015","url":null,"abstract":"Pin limitations are a fundamental obstacle in the construction of massively parallel computers. The paper introduces a class of d-dimensional bussed hypercubes that can perform simultaneous bidirectional communication across any dimension using d+1, rather than 2d, ports per node. Each network $Q_d(T)$ is based on a tree T, which specifies the 'shape' of the busses, and can perform d(d+1)/2 permutations $\\pi_{ij}(x) = x \\oplus c_{ij}$ via a simple global command. This construction is then generalized to any d permutations $\\Pi = (\\pi_1, \\ldots, \\pi_d)$ of any set of nodes X. Given any edge-labeled directed tree T, whose kth arc is associated with the permutation $\\pi_k$, a bussed network $N(\\Pi, T)$ is constructed that can, in one clock tick, perform any of the $O(d^2)$ permutations arising from the paths in the tree T.","PeriodicalId":340070,"journal":{"name":"Proceedings Sixth International Parallel Processing Symposium","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129521002","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}