{"title":"The Augmented Composite Banyan Network","authors":"Hyoung-Il Lee, S. Seo, T. Feng","doi":"10.1109/HIPC.1998.738000","DOIUrl":"https://doi.org/10.1109/HIPC.1998.738000","url":null,"abstract":"A new multipath multistage interconnection network called the Augmented Composite Banyan Network (ACBN) is proposed. The ACBN is created by adding a link to each SE of the Composite Banyan Network (CBN), which is a multipath network with at least two disjoint paths and was originally proposed in (Seo and Feng, 1995). Therefore, the basic building blocks in the ACBN are 4×4 SEs with log_2 N stages. The ACBN inherits the favorable features of the CBN such as regularity, symmetry and easy rerouting capability under faults and conflicts. The ACBN also has an efficient and fast control algorithm that can easily generate a primary routing tag and alternative routing tags as in the CBN. A major improvement of the ACBN over the CBN is the higher connectivity with four disjoint paths between any source and destination pair. Moreover, the ACBN can generate alternative routing tags in a much simpler way, i.e., by a simple binary operation rather than by a conversion table as in the CBN. To compare the ACBN with other networks in connectivity, we define the degree of connectivity as a new connectivity measure. The comparison results show that, owing to its high connectivity, the ACBN can resolve more random connection requests than other networks while using only a small amount of additional hardware.","PeriodicalId":175528,"journal":{"name":"Proceedings. Fifth International Conference on High Performance Computing (Cat. No. 98EX238)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116976015","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A parallel skeletonization algorithm and its VLSI architecture","authors":"N. Sudha, Sukumar Nandi","doi":"10.1109/HIPC.1998.737972","DOIUrl":"https://doi.org/10.1109/HIPC.1998.737972","url":null,"abstract":"This paper presents a new algorithm to extract the skeleton and its Euclidean distance values from a binary image. A VLSI implementation of the algorithm in a locally connected cellular array is also given. The algorithm runs in O(n) time for an image of size n×n. The extracted skeleton reconstructs the objects in the image exactly.","PeriodicalId":175528,"journal":{"name":"Proceedings. Fifth International Conference on High Performance Computing (Cat. No. 98EX238)","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125545222","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Apportioning: a technique for efficient reachability analysis of concurrent object-oriented programs","authors":"S. Iyer, S. Ramesh","doi":"10.1109/HIPC.1998.737980","DOIUrl":"https://doi.org/10.1109/HIPC.1998.737980","url":null,"abstract":"The object-oriented paradigm has been found to be useful for the construction of large and complex concurrent systems. Reachability analysis is an important and well-known tool for static (pre-run-time) analysis of concurrent programs. However, direct application of traditional reachability analysis to concurrent object-oriented programs has many problems, such as incomplete analysis for reusable classes (not safe) and increased computational complexity (not efficient). We have proposed a novel technique called apportioning, for effective reachability analysis of concurrent object-oriented programs, that integrates the techniques of abstraction (considering a reduced representation of the system) and partitioning (dividing the system into smaller units). The given program is apportioned into a reduced version of each of its classes, as well as a reduced version of the program. The error to be checked is also decomposed into appropriate sub-properties for checking in the reachability graphs corresponding to the apportioned program. We have developed a number of apportioning-based algorithms, having different degrees of safety and effectiveness. In this paper, we present the details of one of these algorithms.","PeriodicalId":175528,"journal":{"name":"Proceedings. Fifth International Conference on High Performance Computing (Cat. No. 98EX238)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115013661","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Executing serializable transactions within a hard real-time database system","authors":"S. Bhalla","doi":"10.1109/HIPC.1998.738015","DOIUrl":"https://doi.org/10.1109/HIPC.1998.738015","url":null,"abstract":"A number of factors contribute to delays in transaction execution. In a real-time database system, delays due to deadlocks, data access, and transaction commit must be reduced to enable a transaction to complete successfully. In this report, a model of transaction execution is presented that permits execution and commit of any hard real-time transaction. The execution proceeds without deadlock or other blocking delays, such as those due to denial of lock or commit approvals. The proposed technique is based on transaction classification and implementation of a precedence management scheme. The scheme provides an instantaneous execution opportunity to a serializable transaction within a hard real-time database system.","PeriodicalId":175528,"journal":{"name":"Proceedings. Fifth International Conference on High Performance Computing (Cat. No. 98EX238)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129242255","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A conservative parallel simulation algorithm for entity-oriented modeling","authors":"C. Lim, M. Low, Boon-Ping Gan","doi":"10.1109/HIPC.1998.738018","DOIUrl":"https://doi.org/10.1109/HIPC.1998.738018","url":null,"abstract":"Conservative parallel simulation protocols are such that each logical process (LP) in the simulation executes events only when it is certain that there will not be any time-order causality violation. In these conservative protocols, time-bound information for an LP is computed from the other LPs. We propose a variant form of conservative parallel simulation protocol in which the time-bound for an LP is computed from the existing events in the system. In a conservative protocol such as the Chandy-Misra-Bryant (CMB) protocol, it can be difficult for LP_i to guarantee different time-bounds to its next LPs. The time-bounds to some LPs may be more restrictive than necessary. If the time-bounds are provided by the entities (events) going through the system (as in our proposed algorithm), an event E at LP_i can independently supply different time-bounds only to those LPs which may receive events generated by E. We describe the algorithm, outline a proof of its correctness and discuss its possible strengths and weaknesses.","PeriodicalId":175528,"journal":{"name":"Proceedings. Fifth International Conference on High Performance Computing (Cat. No. 98EX238)","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116479063","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An improved parallel disk scheduling algorithm","authors":"M. Kallahalla, P. Varman","doi":"10.1109/HIPC.1998.738012","DOIUrl":"https://doi.org/10.1109/HIPC.1998.738012","url":null,"abstract":"We address the problems of prefetching and I/O scheduling for read-once reference strings in a parallel I/O system. Read-once reference strings, in which each block is accessed exactly once, arise naturally in applications like databases and video retrieval. Using the standard parallel disk model with D disks and a shared I/O buffer of size M, we present a novel algorithm, red-black prefetching (RBP), for parallel I/O scheduling. The number of parallel I/Os performed by RBP is within O(D^{1/3}) of the minimum possible. Algorithm RBP is easy to implement and requires computation time linear in the length of the reference string. Through simulation experiments we validated the benefits of RBP over simple greedy prefetching.","PeriodicalId":175528,"journal":{"name":"Proceedings. Fifth International Conference on High Performance Computing (Cat. No. 98EX238)","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125657555","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"New number representation and conversion techniques on reconfigurable mesh","authors":"A. Bertossi, A. Mei","doi":"10.1109/HIPC.1998.737964","DOIUrl":"https://doi.org/10.1109/HIPC.1998.737964","url":null,"abstract":"Several new number representations based on the residue number system are presented which use the smallest prime numbers as moduli and are suited for parallel computations on a reconfigurable mesh architecture. It is shown how to convert in O(1) time any integer ranging between 0 and n-1, from any commonly used representation to any new representation proposed in the paper (and vice versa) using an n × O(log^2 n / log log n) reconfigurable mesh. In particular, some of the previously known conversion techniques are improved. Moreover, as a by-product, it is shown how to compute in O(1) time the prefix sums of n bits, improving previously known results. Applications to the summation and prefix sums of N h-bit integers are also considered. The summation and the prefix sums can be computed in O(1) time using O(h log N + log^2 N / log log N) × Nh and O(h^2 + log^2 N / log(h + log N)) × O(N(h + log N)) reconfigurable meshes, respectively, improving all previously known results for most values of h including, for instance, h = O(log N).","PeriodicalId":175528,"journal":{"name":"Proceedings. Fifth International Conference on High Performance Computing (Cat. No. 98EX238)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132992653","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Processor allocation using user directives in mesh-connected multicomputer systems","authors":"Chung-Yen Chang, P. Mohapatra","doi":"10.1109/HIPC.1998.738002","DOIUrl":"https://doi.org/10.1109/HIPC.1998.738002","url":null,"abstract":"Contemporary processor allocation schemes for multicomputers suffer from a fragmentation problem which causes underutilization of the processing nodes. The RSR and ANCA schemes, based on the concepts of size-reduction and non-contiguous allocation, show considerable performance improvement. However, the penalties associated with these schemes limit their usage in environments with memory-bounded or communication-intensive jobs. We propose to use simple directives provided by the users to apply the appropriate allocation scheme and thus maximize the performance benefit. We present such a hybrid processor allocation scheme which combines the previously proposed conventional allocation, RSR and ANCA schemes. The hybrid allocation scheme is evaluated via extensive simulation. It outperforms the conventional allocation scheme and can be implemented easily while maximizing resource utilization.","PeriodicalId":175528,"journal":{"name":"Proceedings. Fifth International Conference on High Performance Computing (Cat. No. 98EX238)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130138150","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient algorithms for delay-bounded minimum cost path problem in communication networks","authors":"G. Kumar, Nishit Narang, C. Ravikumar","doi":"10.1109/HIPC.1998.737982","DOIUrl":"https://doi.org/10.1109/HIPC.1998.737982","url":null,"abstract":"As the amount of data transmitted over a network increases and high-bandwidth applications requiring point-to-multipoint communication, such as videoconferencing, distributed database management or cooperative work, become widespread, it becomes very important to optimize network resources. One such optimization is multicast tree generation. The problem of generating a minimum cost multicast tree given the network topology and costs associated with the connecting links can be modelled as a Steiner tree problem, which is NP-complete. Much work has been done in the direction of obtaining near-optimal multicast trees when the objective is only to minimize the cost. However, many real-time applications such as videoconferencing require that data be sent within prespecified delay limits in order to avoid problems such as anachronism and lack of synchronization. We deal with the delay-bounded cost-optimal multicast tree (DBCMT) generation problem. Specifically, we discuss a closely related problem, which is to find a delay-bounded cost-optimal path (DBCP) between a specified source and destination node. Such a path can be used as a starting point to solve the DBCMT. We present an exact solution to the DBCP which is based on the branch-and-bound paradigm. We also propose a heuristic technique to solve the DBCP using the principle of evolutionary computing. The results obtained using the two techniques are compared for several large networks.","PeriodicalId":175528,"journal":{"name":"Proceedings. Fifth International Conference on High Performance Computing (Cat. No. 98EX238)","volume":"39 4","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132393676","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Message passing support on StarT-Voyager","authors":"B. S. Ang, Derek Chiou, L. Rudolph, Arvind","doi":"10.1109/HIPC.1998.737993","DOIUrl":"https://doi.org/10.1109/HIPC.1998.737993","url":null,"abstract":"No single message passing mechanism can efficiently support all types of communication that commonly occur in most parallel or distributed programs. MIT's StarT-Voyager, a hybrid message passing/shared memory parallel machine, provides four message passing mechanisms to achieve high performance over a wide spectrum of communication types and sizes. Hardware and address translation enforced protection allows direct user-level access to message passing facilities in a multiuser environment. StarT-Voyager's protection scheme improves upon past designs by not requiring strictly synchronized gang-scheduling, and by supporting non-monolithic protection domains. To minimize the development effort and cost, the machine is designed to use unmodified commercial PowerPC 604-based SMP systems as the building block. A Network End-point Subsystem (NES) card which plugs into one of each SMP's processor card slots provides the interface to Arctic, a low-latency, high-bandwidth network developed at MIT. This paper describes StarT-Voyager's message passing mechanisms and their predicted performance.","PeriodicalId":175528,"journal":{"name":"Proceedings. Fifth International Conference on High Performance Computing (Cat. No. 98EX238)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131347310","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}