{"title":"Runtime support for parallelization of data-parallel applications on adaptive and nonuniform computational environments","authors":"M. Kaddoura, S. Ranka","doi":"10.1109/HPDC.1996.546171","DOIUrl":"https://doi.org/10.1109/HPDC.1996.546171","url":null,"abstract":"In this paper, we discuss the runtime support required for the parallelization of unstructured data-parallel applications on nonuniform and adaptive environments. The approach presented is reasonably general and is applicable to a wide variety of regular as well as irregular applications. We present performance results for the solution of an unstructured mesh on a cluster of heterogeneous workstations.","PeriodicalId":267002,"journal":{"name":"Proceedings of 5th IEEE International Symposium on High Performance Distributed Computing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1996-12-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132077091","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dynamically controlling false sharing in distributed shared memory","authors":"V. Freeh, G. Andrews","doi":"10.1109/HPDC.1996.546211","DOIUrl":"https://doi.org/10.1109/HPDC.1996.546211","url":null,"abstract":"Distributed shared memory (DSM) alleviates the need to program message passing explicitly on a distributed-memory machine. In order to reduce memory latency, a DSM replicates copies of data. This paper examines several current approaches to controlling thrashing caused by false sharing in a DSM. Then it introduces a novel memory consistency protocol, writer-owns, which detects and eliminates false sharing at run time. In iterative computations, where the data is accessed similarly every iteration, the writer-owns protocol can have tremendous benefits because the overhead of eliminating false sharing is only incurred once. Performance results show that the writer-owns protocol is competitive with and often better than existing approaches.","PeriodicalId":267002,"journal":{"name":"Proceedings of 5th IEEE International Symposium on High Performance Distributed Computing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1996-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121555708","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Run-time statistical estimation of task execution times for heterogeneous distributed computing","authors":"Michael A. Iverson, F. Özgüner, G. Follen","doi":"10.1109/HPDC.1996.546196","DOIUrl":"https://doi.org/10.1109/HPDC.1996.546196","url":null,"abstract":"An efficient run time, statistical scheme for estimating the execution time of a task is presented, in order to facilitate run time matching and scheduling in a distributed heterogeneous computing environment. This scheme is based upon a nonparametric regression technique, where the execution time estimate for a task is computed from past observations. Furthermore, this technique is able to compensate for different parameters upon which the execution time depends, and does not require any knowledge of the architecture of the target machine. It is also able to make accurate predictions when erroneous data is present in the set of observations, and has been experimentally shown to produce estimates with very low error even with few past values from which to calculate a new estimate.","PeriodicalId":267002,"journal":{"name":"Proceedings of 5th IEEE International Symposium on High Performance Distributed Computing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1996-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128094576","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"UbiWorld: an environment integrating virtual reality, supercomputing and design","authors":"M. Papka, R. Stevens","doi":"10.1109/HPDC.1996.546200","DOIUrl":"https://doi.org/10.1109/HPDC.1996.546200","url":null,"abstract":"Summary form only given. UbiWorld is a concept that ties together the notion of ubiquitous computing (Ubicomp) with that of using virtual reality for rapid prototyping. The goal is to develop an environment where one can explore Ubicomp-type concepts without having to build real Ubicomp hardware. The basic notion is to extend object models in a virtual world using distributed wide-area heterogeneous computing technology to provide complex networking and processing capabilities to virtual reality objects. Starting with the CAVE™ family of display devices, we integrate tools for the construction of 3D objects into the existing library. Then, using these objects as models, we can embed new information technology within them. The plan is then to couple the virtual objects to remote computers via fine-grain heterogeneous computing technology to provide Ubicomp behavior and functionality to the modeled objects. We tightly couple the process-defined behavior with the 3D objects and place these objects into rooms, creating a shared virtual world where users can experiment with using the virtual devices. Each object in the world has its behavior controlled by a program running some place on the network. This behavior could be one that in the real object would be provided by a local computer or by a combination of local computer and network connection to remote processors or databases. These \"behavior\" processes are able to communicate with each other using a shared protocol (UbiWorldcomm). These objects also react and are influenced directly by interactions with the virtual world and users.","PeriodicalId":267002,"journal":{"name":"Proceedings of 5th IEEE International Symposium on High Performance Distributed Computing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1996-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125741844","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Programmability and service creation for multimedia networks","authors":"A. Lazar, K. Lim","doi":"10.1109/HPDC.1996.546191","DOIUrl":"https://doi.org/10.1109/HPDC.1996.546191","url":null,"abstract":"The Binding Architecture is an open architecture for building multimedia networks that must guarantee quality of service (QOS). We explore the notion of a service in the Binding Architecture and propose a conceptual model for building scalable multimedia distribution services based on it. We begin first by examining the relation between resources, their abstractions and the services that can be built from them and use this to derive a general model for binding. Based on this model, we identify a general set of capabilities required for building any multimedia distribution service. We also describe how these capabilities can be incorporated into our view of the service creation process. Finally, we augment our discussion with the description of an example service which we have developed using this paradigm.","PeriodicalId":267002,"journal":{"name":"Proceedings of 5th IEEE International Symposium on High Performance Distributed Computing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1996-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131656476","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Supporting HPDC applications over ATM networks with cell-based transport mechanisms","authors":"Joan Vila-Sallent, J. Solé-Pareta","doi":"10.1109/HPDC.1996.546230","DOIUrl":"https://doi.org/10.1109/HPDC.1996.546230","url":null,"abstract":"We address the problem of supporting high performance distributed computing (HPDC) applications running over ATM networks. For this purpose, we consider a logically separate subnetwork for these applications. After presenting an architectural reference model for the HPDC subnetwork and distinguishing which functions should be installed over the ATM network in order to satisfy the needs of HPDC applications, we propose two mechanisms that aim at optimizing communications by taking advantage of both the special properties of HPDC traffic and the cell-based nature of ATM. The performance of these mechanisms is evaluated and compared with that achieved by the SSCOP protocol. The results show that when the ATM network experiences high load and the HPDC applications make intensive use of arrays, cell-based mechanisms become more robust than standard SSCOP and provide low latency and efficient cell loss recovery. Since both situations are very likely to occur in HPDC environments, we conclude that the introduction of cell-based retransmission mechanisms does help enhance the performance of HPDC systems over ATM networks.","PeriodicalId":267002,"journal":{"name":"Proceedings of 5th IEEE International Symposium on High Performance Distributed Computing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1996-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132811519","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Performance comparison of desktop multiprocessing and workstation cluster computing","authors":"Phyllis E. Crandall, Eranti V. Sumithasri, M. Clement","doi":"10.1109/HPDC.1996.546197","DOIUrl":"https://doi.org/10.1109/HPDC.1996.546197","url":null,"abstract":"The paper describes initial findings regarding the performance tradeoffs between cluster computing, where the participating processors are independent machines connected by a high speed switch, and desktop multiprocessing, where the processors reside within a single workstation and share a common memory. While interprocessor communication time has typically been cited as the limiting force on performance in the cluster, bus and memory contention have had similar effects in shared memory systems. The advent of high speed interconnects and improved bus and memory access speeds has enhanced the performance curves of both platforms. We present comparisons of the execution times of three applications with varying levels of data dependencies (numerical integration, matrix multiplication, and Jacobi iteration) across three environments: the PVM distributed memory model, the PVM shared memory model, and the Solaris threads package.","PeriodicalId":267002,"journal":{"name":"Proceedings of 5th IEEE International Symposium on High Performance Distributed Computing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1996-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133726452","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Implementation and performance of a parallel file system for high performance distributed applications","authors":"W. Ligon, R. Ross","doi":"10.1109/HPDC.1996.546218","DOIUrl":"https://doi.org/10.1109/HPDC.1996.546218","url":null,"abstract":"Dedicated cluster parallel computers (DCPCs) are emerging as low-cost high performance environments for many important applications in science and engineering. A significant class of applications that perform well on a DCPC are coarse-grain applications that involve large amounts of file I/O. Current research in parallel file systems for distributed systems is providing a mechanism for adapting these applications to the DCPC environment. We present the Parallel Virtual File System (PVFS), a system that provides disk striping across multiple nodes in a distributed parallel computer and file partitioning among tasks in a parallel program. PVFS is unique among similar systems in that it uses a stream-based approach that represents each file access with a single set of request parameters and decouples the number of network messages from details of the file striping and partitioning. PVFS also provides support for efficient collective file accesses and allows overlapping file partitions. We present results of early performance experiments that show PVFS achieves excellent speedups in accessing moderately sized file segments.","PeriodicalId":267002,"journal":{"name":"Proceedings of 5th IEEE International Symposium on High Performance Distributed Computing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1996-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124845426","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A light-weight application sharing infrastructure for graphics intensive applications","authors":"M. Hao, D. Lee, J. Sventek","doi":"10.1109/HPDC.1996.546181","DOIUrl":"https://doi.org/10.1109/HPDC.1996.546181","url":null,"abstract":"We describe a lightweight application sharing infrastructure that enables collaborative design using graphics intensive applications over low bandwidth networks. The technology employs an event-driven mechanism to share a reduced event set among multiple copies of an application executing on different workstations. This technology is referred to as RES-AP (Reduced Event Set Application Sharing). RES-AP allows geographically dispersed engineers to work together on large complex problems with fast responses. This capability is achieved without modification to the applications or to the window system software.","PeriodicalId":267002,"journal":{"name":"Proceedings of 5th IEEE International Symposium on High Performance Distributed Computing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1996-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127384954","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A process migration subsystem for a workstation-based distributed systems","authors":"K. Al-Tawil, M. Bozyigit, S. Naseer","doi":"10.1109/HPDC.1996.546222","DOIUrl":"https://doi.org/10.1109/HPDC.1996.546222","url":null,"abstract":"Workstation-based distributed computing environments are becoming popular in both academic and commercial communities, due to the continuing trend of decreasing cost/performance ratio and rapid development of networking technology. However, the workload on these workstations is usually much lower than their computing capacity, especially with the ever increasing computing power of new hardware. As a result, the resources of such workstations are often underutilized and many of them are frequently idle. A preemptive process migration facility can be provided, in such a distributed system, to dynamically relocate running processes among the component machines. Such relocation can help cope with dynamic fluctuations in loads and service needs, improve the system's fault tolerance, meet real time scheduling deadlines, or bring a process to a special device. The paper presents a process migration subsystem for tolerating process and node failures in a workstation-based environment. The design and implementation of the subsystem are also discussed.","PeriodicalId":267002,"journal":{"name":"Proceedings of 5th IEEE International Symposium on High Performance Distributed Computing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1996-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130876696","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}