{"title":"NYNET Communication System (NCS): a multithreaded message passing tool over ATM network","authors":"Sung-Yong Park, S. Hariri, Yoonhee Kim, J. Harris, Rajesh Yadav","doi":"10.1109/HPDC.1996.546217","DOIUrl":"https://doi.org/10.1109/HPDC.1996.546217","url":null,"abstract":"Current advances in processor technology, and the rapid development of high speed networking technology, such as ATM, have made high performance network computing an attractive computing environment for large-scale high performance distributed computing (HPDC) applications. However, due to the communications overhead at the host-network interface, most of the HPDC applications are not getting the full benefit of high speed communication networks. This overhead can be attributed to the high cost of operating system calls, context switching, the use of inefficient communication protocols, and the coupling of data and control paths. We present an architecture and implementation for a low-latency, high-throughput message passing tool, that we refer to as the NYNET (ATM wide area network testbed in New York state) Communication System (NCS), which can support a variety of HPDC applications with different Quality of Services (QOS) requirements. NCS uses multithreading to provide efficient techniques that overlap computation and communication. NCS uses read/write trap routines to bypass traditional operating system calls. This reduces latency and avoids using inefficient communication protocols. By separating data and control paths, NCS eliminates unnecessary control transfers. This optimizes the data path and improves performance. Benchmarking results show that the performance of NCS is at least a factor of two better than the performance of corresponding p4 and PVM primitives.","PeriodicalId":267002,"journal":{"name":"Proceedings of 5th IEEE International Symposium on High Performance Distributed Computing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1996-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126369392","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Employing logic-enhanced memory for high-performance ATM network interfaces","authors":"Henky Agusleo, N. Soparkar","doi":"10.1109/HPDC.1996.546188","DOIUrl":"https://doi.org/10.1109/HPDC.1996.546188","url":null,"abstract":"With the advent of asynchronous transfer mode (ATM), network speeds are increasing substantially, and the greater communication bandwidth provides considerable potential for distributed applications. However, communication throughput delivered to the user applications has not increased as rapidly: network interfacing has become the next bottleneck. Consequently, better designs for network interfaces are essential to take full advantage of the higher-speed communication links such as the ATM. We present a novel hardware architecture for an ATM network interface employing memory with select on-chip data processing logic; i.e., logic-enhanced memory (LEM). We exhibit how LEM-based interfaces are particularly suitable for servers, where the majority of the operations are memory-intensive processing for client requests.","PeriodicalId":267002,"journal":{"name":"Proceedings of 5th IEEE International Symposium on High Performance Distributed Computing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1996-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127586467","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Engineering parallel algorithms","authors":"Niandong Fang","doi":"10.1109/HPDC.1996.546193","DOIUrl":"https://doi.org/10.1109/HPDC.1996.546193","url":null,"abstract":"The rise of explicit parallel programming involves new problems: lack of structure for parallel algorithms and the ad hoc development of parallel algorithms. We use skeletons to characterize and design parallel algorithms and define a process to refine the designs step by step into programs. The paper introduces a high level library on top of MPI which is derived from the skeleton concept to achieve better programmability and obtain portability. We conclude with a CFD application to demonstrate our idea.","PeriodicalId":267002,"journal":{"name":"Proceedings of 5th IEEE International Symposium on High Performance Distributed Computing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1996-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129059639","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multitarget tracking algorithm parallelization for distributed-memory computing systems","authors":"R. Popp, K. Pattipati, Y. Bar-Shalom, R. R. Gassner","doi":"10.1109/HPDC.1996.546212","DOIUrl":"https://doi.org/10.1109/HPDC.1996.546212","url":null,"abstract":"We present a robust scalable parallelization of a multitarget tracking algorithm developed for air traffic surveillance. We couple the state estimation and data association problems by embedding an interacting multiple model (IMM) state estimator into an optimization-based assignment framework. A SPMD distributed-memory parallelization is described wherein the interface to the optimization problem, namely computing the rather numerous gating and IMM state estimates, covariance calculations, and likelihood function evaluations (used as cost coefficients in the assignment problem), is parallelized. We describe several heuristic algorithms developed for the inherent task allocation problem wherein the problem is one of assigning track tasks, having uncertain processing costs and negligible communication costs, across a set of homogeneous processors to minimize workload imbalances. Using a measurement database based on two FAA air traffic central radars, courtesy of Rome Laboratory, we show that near linear speedups are obtainable on a 32-node Intel Paragon supercomputer using simple task allocation algorithms.","PeriodicalId":267002,"journal":{"name":"Proceedings of 5th IEEE International Symposium on High Performance Distributed Computing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1996-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133549015","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A source-level transformation framework for RPC-based distributed programs","authors":"Tae-hyung Kim, James M. Purtilo","doi":"10.1109/HPDC.1996.546176","DOIUrl":"https://doi.org/10.1109/HPDC.1996.546176","url":null,"abstract":"The remote procedure call (RPC) paradigm has been a favorite of programmers who write distributed programs because RPC uses a familiar procedure call abstraction as the sole mechanism of remote operation. The abstraction helps to simplify programming tasks, but this does not mean that the resulting program's RPC-based flow of control will be anything close to ideal for high performance. The purpose of our research is to provide a source-level transformation framework as an alternative way to implement an RPC-based distributed program, so that the code can be optimized through program analysis techniques.","PeriodicalId":267002,"journal":{"name":"Proceedings of 5th IEEE International Symposium on High Performance Distributed Computing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1996-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132467137","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automatic management of CPU and I/O bottlenecks in distributed applications on ATM networks","authors":"Marc A. Nurmi, William E. Bejcek, Rod N. Gregoire, Li-Yu Daisy Liu, Mark D. Pohl","doi":"10.1109/HPDC.1996.546219","DOIUrl":"https://doi.org/10.1109/HPDC.1996.546219","url":null,"abstract":"Existing parallel programming environments for networks of workstations improve the performance of computationally intensive applications by using message passing or virtual shared memory to alleviate CPU bottlenecks. This paper describes an approach based on message passing that addresses both CPU and I/O bottlenecks for a specific class of distributed applications on ATM networks. ATM provides the bandwidth required to utilize multiple I/O channels in parallel. This paper also describes an environment based on distributed process management and centralized application management that implements the approach. The environment adds processes to a running application when necessary to alleviate CPU and I/O bottlenecks while managing process connections in a manner that is transparent to the application.","PeriodicalId":267002,"journal":{"name":"Proceedings of 5th IEEE International Symposium on High Performance Distributed Computing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1996-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128326248","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The network video terminal","authors":"D. Sisalem, H. Schulzrinne, C. Sieckmeyer","doi":"10.1109/HPDC.1996.546167","DOIUrl":"https://doi.org/10.1109/HPDC.1996.546167","url":null,"abstract":"Currently, a variety of the MBONE video tools provide video conferencing capabilities on different platforms and with a variety of compression algorithms. However, most of these tools lack the ability to interact with other media agents that might be used during a conferencing session. Such interaction is required, for example, for achieving lip synchronisation between audio and video streams or for quality of service control. In this paper we present a new video tool, NEVIT. This tool provides the basic capabilities needed for video conferencing services such as video capturing, compression and decompression engines and multicasting and ATM network interfaces. To ease the interaction with other media agents, NEVIT incorporates a message handling facility to interact over a local conference bus with other media agents, a floor controller or the conference controller. Currently, we are working on adding lip synchronisation and quality-of-service control using this conference bus.","PeriodicalId":267002,"journal":{"name":"Proceedings of 5th IEEE International Symposium on High Performance Distributed Computing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1996-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130919090","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"\"Media-on-demand\" multimedia electronic mail: a tool for collaboration on the Web","authors":"K. Tsoi, S. Rahman","doi":"10.1109/HPDC.1996.546180","DOIUrl":"https://doi.org/10.1109/HPDC.1996.546180","url":null,"abstract":"Undoubtedly, multimedia electronic mail has many advantages in exchanging information electronically in collaborative work. The existing design of an e-mail systems architecture is inefficient in exchanging a multimedia message which has a much larger volume, and requires more bandwidth and storage space than the text-only messages. We present an innovative method for exchanging multimedia mail messages in a heterogeneous environment to support collaborative work over WWW on the Internet. We propose a \"Parcel Collection\" approach for exchanging multimedia electronic mail messages. This approach for exchanging multimedia electronic mail messages integrates the current WWW technologies with the existing electronic mail systems.","PeriodicalId":267002,"journal":{"name":"Proceedings of 5th IEEE International Symposium on High Performance Distributed Computing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1996-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129911587","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Approaches for a reliable high-performance distributed-parallel storage system","authors":"Q. Malluhi, W. Johnston","doi":"10.1109/HPDC.1996.546221","DOIUrl":"https://doi.org/10.1109/HPDC.1996.546221","url":null,"abstract":"The paper studies different schemes to enhance the reliability, availability and security of a high performance distributed storage system. We have previously designed a distributed parallel storage system that employs the aggregate bandwidth of multiple data servers connected by a high speed wide area network to achieve scalability and high data throughput. The general approach of the paper employs erasure error correcting codes to add data redundancy that can be used to retrieve missing information caused by hardware, software, or human faults. The paper suggests techniques for reducing the communication and computation overhead incurred while retrieving missing data blocks from redundant information. These techniques include clustering, multidimensional coding, and the full two dimensional parity scheme.","PeriodicalId":267002,"journal":{"name":"Proceedings of 5th IEEE International Symposium on High Performance Distributed Computing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1996-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128826340","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A federated model for scheduling in wide-area systems","authors":"J. Weissman, A. Grimshaw","doi":"10.1109/HPDC.1996.546225","DOIUrl":"https://doi.org/10.1109/HPDC.1996.546225","url":null,"abstract":"A model for scheduling in wide area systems is described. The model is federated and utilizes a collection of local site schedulers that control the use of their resources. The wide area scheduler consults the local site schedulers to obtain candidate machine schedules. A set of issues and challenges inherent to wide area scheduling are also described and the proposed model is shown to address many of these problems. A distributed algorithm for wide area scheduling is presented and relies upon information made available about the resource needs of user jobs. The wide area scheduler will be implemented in Legion, a wide area computing system developed at the University of Virginia.","PeriodicalId":267002,"journal":{"name":"Proceedings of 5th IEEE International Symposium on High Performance Distributed Computing","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1996-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122709220","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}