{"title":"Assignment of ADT modules to processors","authors":"L. Welch","doi":"10.1109/IPPS.1992.223069","DOIUrl":"https://doi.org/10.1109/IPPS.1992.223069","url":null,"abstract":"The utilization of reusable software components can help to reduce the complexity of developing and maintaining parallel programs, but can lead to inefficiencies. The potential inefficiencies are addressed by providing a model of parallel execution (asynchronous remote procedure call, or ARPC) that not only speeds up programs, but also encourages the development of layered software by increasing parallelism in correspondence to increases in layering. The paper presents an efficient algorithm for assigning the reusable modules of a program to the processing elements of a parallel computer that supports ARPC. The objectives of the assignment algorithm are to permit maximum inter-module parallelism with the fewest possible PEs, and to prevent deadlock. The algorithm differs from previous solutions to the assignment problem in that the modules to be assigned are generic abstract data type modules, not procedures, tasks, or processes.<<ETX>>","PeriodicalId":340070,"journal":{"name":"Proceedings Sixth International Parallel Processing Symposium","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-03-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134477941","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Parallel heap: improved and simplified","authors":"S. Prasad, N. Deo","doi":"10.1109/IPPS.1992.223004","DOIUrl":"https://doi.org/10.1109/IPPS.1992.223004","url":null,"abstract":"Describes a new updated version of the data structure parallel heap. Employing p processors, a parallel heap allows deletions of Theta(p) highest-priority items and insertions of Theta(p) new items, each in O(log n) time on an EREW PRAM, where n is the size of the parallel heap. Furthermore, it can efficiently utilize processors in the range 1 through n. This version does not require dedicated maintenance processors, and performs insertion and deletion in place.<<ETX>>","PeriodicalId":340070,"journal":{"name":"Proceedings Sixth International Parallel Processing Symposium","volume":"75 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-03-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122544401","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A distributed data-balanced dictionary based on the B-link tree","authors":"T. Johnson, A. Colbrook","doi":"10.1109/IPPS.1992.223026","DOIUrl":"https://doi.org/10.1109/IPPS.1992.223026","url":null,"abstract":"Many concurrent dictionary data structures have been proposed, but usually in the context of shared memory multiprocessors. The paper presents an algorithm for a concurrent distributed B-tree that can be implemented on message passing parallel computers. This distributed B-tree (the dB-tree) replicates the interior nodes in order to improve parallelism and reduce message passing. It is shown how the dB-tree algorithm can be used to build an efficient, highly parallel, data-balanced distributed dictionary, the dE-tree.<<ETX>>","PeriodicalId":340070,"journal":{"name":"Proceedings Sixth International Parallel Processing Symposium","volume":"475 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123044267","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A parallel approach to hybrid range image segmentation","authors":"Nicholas Giolmas, D. Watson, D. Chelberg, H. Siegel","doi":"10.1109/IPPS.1992.223024","DOIUrl":"https://doi.org/10.1109/IPPS.1992.223024","url":null,"abstract":"Parallel processing methods are an attractive means to achieve significant speedup of computationally expensive image understanding algorithms, such as those applied to range images. Mixed-mode parallel systems are ideally suited to this area because of the flexibility in using the different modes of parallelism. The trade-offs of using different parallel modes are examined through the implementation of hybrid range segmentation operations, characteristic of a broad class of low level image processing algorithms. Alternative means of distributing data among the processing elements that achieve improved performance are considered. Results comparing different implementations on a single reconfigurable parallel processor, PASM, indicate some generally applicable guidelines for the effective parallelization of vision algorithms.<<ETX>>","PeriodicalId":340070,"journal":{"name":"Proceedings Sixth International Parallel Processing Symposium","volume":"82 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125899805","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluation of connected component labeling algorithms on shared and distributed memory multiprocessors","authors":"A. Choudhary, R. Thakur","doi":"10.1109/IPPS.1992.223019","DOIUrl":"https://doi.org/10.1109/IPPS.1992.223019","url":null,"abstract":"Presents parallel implementations of connected component labeling for grey level images on the iPSC/2 and iPSC/860 hypercubes and on the Encore Multimax shared memory multiprocessor. Several partitioning and mapping strategies, including multi-dimensional divide and conquer, block decomposition and scatter decomposition, are used. Implementation results, performance evaluation and comparison for all the mapping strategies are reported.<<ETX>>","PeriodicalId":340070,"journal":{"name":"Proceedings Sixth International Parallel Processing Symposium","volume":"62 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123237813","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Design and analysis of fault-detecting and fault-locating schedules for computation DAGs","authors":"S. Yajnik, N. Jha","doi":"10.1109/IPPS.1992.223022","DOIUrl":"https://doi.org/10.1109/IPPS.1992.223022","url":null,"abstract":"The paper investigates issues concerning the construction of fault-detecting and fault-locating schedules for multiprocessor systems. It develops conditions for a schedule to be fault-detecting or fault-locating and further uses these conditions to propose schemes for construction of the schedules. Lower-bounds on the length of the schedules are calculated and for the special case of binary computation trees, it is shown that the schedules meet the lower-bounds in most cases. A method for actual fault diagnosis from the results of the fault-locating schedules for binary computation trees is also proposed.<<ETX>>","PeriodicalId":340070,"journal":{"name":"Proceedings Sixth International Parallel Processing Symposium","volume":"86 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123578484","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Two-state self-stabilizing algorithms","authors":"M. Flatebo, A. Datta","doi":"10.1109/IPPS.1992.223047","DOIUrl":"https://doi.org/10.1109/IPPS.1992.223047","url":null,"abstract":"A distributed system consists of a set of loosely connected state machines which do not share a global memory. All the possible global states of the system can be split up into legal and illegal states. A self-stabilizing system is a network of processors which, when started from an arbitrary (and possibly illegal) initial state, always returns to a legal state in a finite number of steps. One issue in designing self-stabilizing algorithms is the number of states required by each machine. The paper presents algorithms which are self-stabilizing while only requiring each machine in the network to have two states. Probability is used in some of the algorithms in order to make this possible. The algorithms are given along with correctness proofs.<<ETX>>","PeriodicalId":340070,"journal":{"name":"Proceedings Sixth International Parallel Processing Symposium","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126617128","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The effects of communication overhead on the speedup of parallel 3-D finite element applications","authors":"V. Taylor, B. Nour-Omid, D. Messerschmitt","doi":"10.1109/IPPS.1992.222972","DOIUrl":"https://doi.org/10.1109/IPPS.1992.222972","url":null,"abstract":"The use of parallel processors for implementing the finite element method has made feasible the analyses of large applications, especially three-dimensional applications. The speedup, however, is limited by the interprocessor communication requirements. The authors analyze the effects of interprocessor communications on the resultant speedup of the parallel execution of regular three-dimensional finite element applications. They derive the speedup expressions for the hypercube and mesh topologies. These expressions can be used to analyze the results of different partitioning and mapping strategies.<<ETX>>","PeriodicalId":340070,"journal":{"name":"Proceedings Sixth International Parallel Processing Symposium","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115195825","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A systolic algorithm and architecture for Galois field arithmetic","authors":"M. Kovač, N. Ranganathan, M. Varanasi","doi":"10.1109/IPPS.1992.223032","DOIUrl":"https://doi.org/10.1109/IPPS.1992.223032","url":null,"abstract":"Finite or Galois fields are used in numerous applications like error correcting codes, digital signal processing and cryptography. The design of efficient methods for Galois field arithmetic such as multiplication and division is critical for these applications. The paper presents a new algorithm for computing multiplication and division in GF(2^m). A systolic architecture is described for implementing the algorithm which can produce a new result every clock cycle. The architecture can be realized as a VLSI chip that can yield a computational rate of 40 million multiplications/divisions per second.<<ETX>>","PeriodicalId":340070,"journal":{"name":"Proceedings Sixth International Parallel Processing Symposium","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122167718","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The impact of wiring constraints on hierarchical network performance","authors":"W. Hsu, P. Yew","doi":"10.1109/IPPS.1992.222964","DOIUrl":"https://doi.org/10.1109/IPPS.1992.222964","url":null,"abstract":"A unified approach, incorporating architectural and packaging issues, is necessary in the design of high performance computer networks. Clustering enables the authors to exploit the physical hierarchy imposed by packaging. Previously the authors examined the clustering of hypercube networks within the context of wiring constraints (see 1991 Int. Conf. on Parallel Processing, Aug. 1991). The authors extend their earlier work to compare the performance of hypercubes and meshes. They consider two cost constraints, bisection width and package pinout, and examine flat and clustered meshes and hypercubes. They find that the relative performance of networks depends on the chosen wiring constraint, as well as system configuration and message granularity.<<ETX>>","PeriodicalId":340070,"journal":{"name":"Proceedings Sixth International Parallel Processing Symposium","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128378270","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}