{"title":"Design patterns for parallel computing using a network of processors","authors":"S. Siu, Ajit Singh","doi":"10.1109/HPDC.1997.626434","DOIUrl":"https://doi.org/10.1109/HPDC.1997.626434","url":null,"abstract":"High complexity of building parallel applications is often cited as one of the major impediments to the mainstream adoption of parallel computing. To deal with the complexity of software development, abstractions such as macros, functions, abstract data types, and objects are commonly employed by sequential as well as parallel programming models. This paper describes the concept of a design pattern for the development of parallel applications. A design pattern in our case describes a recurring parallel programming problem and a reusable solution to that problem. A design pattern is implemented as a reusable code skeleton for quick and reliable development of parallel applications. A parallel programming system, called DPnDP (Design Patterns and Distributed Processes), that employs such design patterns is described. In the past, parallel programming systems have allowed fast prototyping of parallel applications based on commonly occurring communication and synchronization structures. The uniqueness of our approach is in the use of a standard structure and interface for a design pattern. This has several important implications: first, design patterns can be defined and added to the system's library in an incremental manner without requiring any major modification of the system (extensibility). Second, customization of a parallel application is possible by mixing design patterns with low level parallel code resulting in a flexible and efficient parallel programming tool (flexibility). Also, a parallel design pattern can be parameterized to provide some variations in terms of structure and behavior.","PeriodicalId":243171,"journal":{"name":"Proceedings. The Sixth IEEE International Symposium on High Performance Distributed Computing (Cat. No.97TB100183)","volume":"255 ","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114057751","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Channel allocation methods for data dissemination in mobile computing environments","authors":"Wang-Chien Lee, Qinglong Hu, Lee","doi":"10.1109/HPDC.1997.626430","DOIUrl":"https://doi.org/10.1109/HPDC.1997.626430","url":null,"abstract":"We discuss several channel allocation methods for data dissemination in mobile computing systems. We suggest that the broadcast and on-demand channels have different access performance under different system parameters and that a mobile cell should use a combination of both to obtain optimal access time for a given workload and system parameters. We study the data access efficiency of three channel configurations: all channels are used as on-demand channels (exclusive on-demand); all channels are used for broadcast (exclusive broadcast); and some channels are on-demand channels and some are broadcast channels (hybrid). Simulations on obtaining the optimal channel allocation for lightly-loaded, medium-loaded, and heavy-loaded conditions are conducted, and the results show that an optimal channel allocation significantly improves the system performance.","PeriodicalId":243171,"journal":{"name":"Proceedings. The Sixth IEEE International Symposium on High Performance Distributed Computing (Cat. No.97TB100183)","volume":"151 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124210066","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Flexible general purpose communication primitives for distributed systems","authors":"R. Baldoni, R. Beraldi, R. Prakash","doi":"10.1109/HPDC.1997.626404","DOIUrl":"https://doi.org/10.1109/HPDC.1997.626404","url":null,"abstract":"This paper presents the slotted-FIFO communication mode that supports communication primitives for the entire spectrum of reliability and ordering requirements of distributed applications: FIFO as well as non-FIFO, and reliable as well as unreliable communication. Hence, the slotted-FIFO communication mode is suitable for multimedia applications, as well as non real-time distributed applications. As FIFO ordering is not required for all messages, message buffering requirements are considerably reduced. Also, message latencies are lower. We quantify such advantages by means of a simulation study. A low overhead protocol implementing slotted-FIFO communication is also presented. The protocol incurs a small resequencing cost.","PeriodicalId":243171,"journal":{"name":"Proceedings. The Sixth IEEE International Symposium on High Performance Distributed Computing (Cat. No.97TB100183)","volume":"74 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124730659","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Run-time support for scheduling parallel applications in heterogeneous NOWs","authors":"J. Weissman, Xin Zhao","doi":"10.1109/HPDC.1997.626442","DOIUrl":"https://doi.org/10.1109/HPDC.1997.626442","url":null,"abstract":"This paper describes the current state of Prophet, a system that provides run-time scheduling support for parallel applications in heterogeneous workstation networks. Prior work on Prophet demonstrated that scheduling SPMD applications could be effectively automated with excellent performance. Enhancements have been made to Prophet to broaden its use to other application types including parallel pipelines, and to make more effective use of dynamic system state information to further improve performance. The results indicate that both SPMD and parallel pipeline applications can be scheduled to produce reduced completion time by exploiting the application structure and run-time information.","PeriodicalId":243171,"journal":{"name":"Proceedings. The Sixth IEEE International Symposium on High Performance Distributed Computing (Cat. No.97TB100183)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122526411","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Speed up your database client with adaptable multithreaded prefetching","authors":"Nils Knafla","doi":"10.1109/HPDC.1997.622367","DOIUrl":"https://doi.org/10.1109/HPDC.1997.622367","url":null,"abstract":"In many client/server object database applications, performance is limited by the delay in transferring pages from the server to the client. We present a prefetching technique that can avoid this delay, especially where there are several database servers. Part of the novelty of this approach lies in the way that multithreading on the client workstation is exploited, in particular for activities such as prefetching and flushing dirty pages to the server. Using our own complex object benchmark we analyze the performance of the prefetching technique with multiple clients, multiple servers and different buffer pool sizes.","PeriodicalId":243171,"journal":{"name":"Proceedings. The Sixth IEEE International Symposium on High Performance Distributed Computing (Cat. No.97TB100183)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122615787","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Performance aspects of switched SCI systems","authors":"M. Liebhart","doi":"10.1109/HPDC.1997.626408","DOIUrl":"https://doi.org/10.1109/HPDC.1997.626408","url":null,"abstract":"The Scalable Coherent Interface (SCI) defines a high-speed interconnect that provides a coherent distributed shared memory system. With the use of switches separate rings can be connected to form large topology-independent configurations. It has been realized that congestion in SCI systems generates additional retry traffic which reduces the available communication bandwidth. This paper investigates additional flow control mechanisms for overloaded switches. They are based on a supplementary retry delay and show a significant throughput gain. Furthermore two different management schemes for the output buffers are investigated. Computer simulations are used to compare the models and to determine system parameters.","PeriodicalId":243171,"journal":{"name":"Proceedings. The Sixth IEEE International Symposium on High Performance Distributed Computing (Cat. No.97TB100183)","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114684088","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cut-through delivery in Trapeze: An exercise in low-latency messaging","authors":"K. Yocum, J. Chase, Andrew J. Gallatin, A. Lebeck","doi":"10.1109/HPDC.1997.626425","DOIUrl":"https://doi.org/10.1109/HPDC.1997.626425","url":null,"abstract":"New network technology continues to improve both the latency and bandwidth of communication in computer clusters. The fastest high-speed networks approach or exceed the I/O bus bandwidths of \"gigabit-ready\" hosts. These advances introduce new considerations for the design of network interfaces and messaging systems for low-latency communication. This paper investigates cut-through delivery, a technique for overlapping host I/O DMA transfers with network traversal. Cut-through delivery significantly reduces end-to-end latency of large messages, which are often critical for application performance. We have implemented cut-through delivery in Trapeze, a new messaging substrate for network memory and other distributed operating system services. Our current Trapeze prototype is capable of demand-fetching 8 K virtual memory pages in 200 μs across a Myrinet cluster of DEC AlphaStations.","PeriodicalId":243171,"journal":{"name":"Proceedings. The Sixth IEEE International Symposium on High Performance Distributed Computing (Cat. No.97TB100183)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115894133","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Design issues in building Web-based parallel programming environments","authors":"K. Dinçer, Geoffrey Fox","doi":"10.1109/HPDC.1997.626432","DOIUrl":"https://doi.org/10.1109/HPDC.1997.626432","url":null,"abstract":"We exploited the recent advances in Internet connectivity and Web technologies for building Web-based parallel programming environments (WPPEs) that facilitate the development and execution of parallel programs on remote high-performance computers. A Web browser running on the user's machine provides a user-friendly interface to server-site user accounts and allows the use of parallel computing platforms and software in a convenient manner. The user may create, edit, and execute files through this Web browser interface. This new Web-based client-server architecture has the potential of being used as a future front-end to high-performance computer systems. We discuss the design and implementation of several prototype WPPEs that are currently in use at the Northeast Parallel Architectures Center and the Cornell Theory Center. These initial prototypes support high-level parallel programming with Fortran 90 and High Performance Fortran (HPF), as well as explicit low-level programming with Message Passing Interface (MPI). We detail the lessons learned during the development process and outline the tradeoffs of various design choices in the realization of the design. We especially concentrate on providing server-site user accounts, mechanisms to access those accounts through the Web, and the Web-related system security issues.","PeriodicalId":243171,"journal":{"name":"Proceedings. The Sixth IEEE International Symposium on High Performance Distributed Computing (Cat. No.97TB100183)","volume":"184 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116354622","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Replaying distributed programs without message logging","authors":"Robert H. B. Netzer, Yikang Xu","doi":"10.1109/HPDC.1997.622370","DOIUrl":"https://doi.org/10.1109/HPDC.1997.622370","url":null,"abstract":"Debugging long program runs can be difficult because of the delays required to repeatedly re-run the execution. Even a moderately long run of five minutes can incur aggravating delays. To address this problem, techniques exist that allow re-executing a distributed program from intermediate points by using combinations of checkpointing and message logging. In this paper we explore another idea: how to support replay without logging the contents of any message. When no messages are logged, the set of global states from which replay is possible is constrained, and it has been unknown how to compute this set without exhaustively searching the space of all global states, whose size is exponential in the number of processes. We present a simple and efficient hybrid on-the-fly/post-mortem algorithm for detecting the necessary and sufficient conditions under which parts of the execution can be replayed without message logs. A small amount of trace (two vectors) is recorded at each checkpoint and a fast post-mortem algorithm computes global states from which replay can begin. This algorithm is independent of the checkpointing technique used.","PeriodicalId":243171,"journal":{"name":"Proceedings. The Sixth IEEE International Symposium on High Performance Distributed Computing (Cat. No.97TB100183)","volume":"57 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128635665","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The software architecture of a virtual distributed computing environment","authors":"H. Topcuoglu, S. Hariri, W. Furmanski, J. Valente, Ilkyeun Ra, Dongmin Kim, Yoonhee Kim, Xue Bing, Baoqing Ye","doi":"10.1109/HPDC.1997.622361","DOIUrl":"https://doi.org/10.1109/HPDC.1997.622361","url":null,"abstract":"The requirements of grand challenge problems and the deployment of gigabit networks make the network computing framework an attractive and cost effective computing environment with which to interconnect geographically distributed processing and storage resources. Our project, Virtual Distributed Computing Environment (VDCE), provides a problem-solving environment for high-performance distributed computing over wide area networks. VDCE delivers well-defined library functions that relieve end-users of tedious task implementations and also support reusability. In this paper we present the conceptual design of VDCE software architecture, which is defined in three modules: (a) the Application Editor, a user-friendly application development environment that generates the Application Flow Graph (AFG) of an application; (b) the Application Scheduler, which provides an efficient task-to-resource mapping of AFG; and (c) the VDCE Runtime System, which is responsible for running and managing application execution and monitoring the VDCE resources.","PeriodicalId":243171,"journal":{"name":"Proceedings. The Sixth IEEE International Symposium on High Performance Distributed Computing (Cat. No.97TB100183)","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125377691","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}