{"title":"Linear algebra algorithms in a heterogeneous cluster of personal computers","authors":"Jorge G. Barbosa, J. Tavares, A. J. Padilha","doi":"10.1109/HCW.2000.843740","DOIUrl":"https://doi.org/10.1109/HCW.2000.843740","url":null,"abstract":"Cluster computing is presently a major research area, mostly for high performance computing. The work presented refers to the application of cluster computing in a small scale where a virtual machine is composed of a small number of off-the-shelf personal computers connected by a low cost network. A methodology to determine the optimal number of processors to be used in a computation is presented as well as the speedup results obtained for the matrix-matrix multiplication and for the symmetric QR algorithm for eigenvector computation which are significant building blocks for applications in the target image processing and analysis domain. The load balancing strategy is also addressed.","PeriodicalId":351836,"journal":{"name":"Proceedings 9th Heterogeneous Computing Workshop (HCW 2000) (Cat. No.PR00556)","volume":"75 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125871522","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MoBiDiCK: a tool for distributed computing on the Internet","authors":"M. Dharsee, C. Hogue","doi":"10.1109/HCW.2000.843755","DOIUrl":"https://doi.org/10.1109/HCW.2000.843755","url":null,"abstract":"We have developed a software tool called MoBiDiCK (Modular Big Distributed Computing Kernel) that is ultimately intended for distributed computing. In this paper, we detail the design and show results using the core components of MoBiDiCK running two different clients on a local cluster. MoBiDiCK is a database-driven system that can be used to marshal a large number of processors across the Internet in order to have them collaborate on a single computation. These utilize a message-passing API and control synchronization formalism we have developed that uses the HTTP standard and Web servers. CGI programs on the volunteer processors perform the computations. The problem domains best served by MoBiDiCK are parallel computing problems that are CPU-bound (not I/O-bound) and require minimal inter-process communication. The parallel tasks that we present include the analysis of databases of 3D protein structures and Monte Carlo simulations for ab-initio protein folding.","PeriodicalId":351836,"journal":{"name":"Proceedings 9th Heterogeneous Computing Workshop (HCW 2000) (Cat. No.PR00556)","volume":"94 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116736172","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A cost/benefit model for dynamic resource sharing","authors":"D. Katramatos, D. Saxena, Nehal Mehta, S. Chapin","doi":"10.1109/HCW.2000.843753","DOIUrl":"https://doi.org/10.1109/HCW.2000.843753","url":null,"abstract":"The use of multicomputer clusters composed of cheap workstations connected by high-speed networks is common in modern high-performance computing. However, operating system research in such environments has lagged. Our research aims at enhancing the functionality of the operating system by providing management functions that allow dynamic resource sharing and performance prediction in a clustered environment supporting distributed shared memory and multi-threading. Central to this approach is the development of a parametric cost model that can predict the performance ramifications of policy choices and allow applications and middleware to adapt to the computing environment and achieve better performance.","PeriodicalId":351836,"journal":{"name":"Proceedings 9th Heterogeneous Computing Workshop (HCW 2000) (Cat. No.PR00556)","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128976630","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Agent-based resource discovery","authors":"K. Jun, Ladislau Bölöni, K. Palacz, D. Marinescu","doi":"10.1109/HCW.2000.843731","DOIUrl":"https://doi.org/10.1109/HCW.2000.843731","url":null,"abstract":"Presents a distributed discovery method allowing individual nodes to gather information about resources in a wide-area distributed system made up of autonomous systems linked together by a network technology substrate. We introduce an algorithm and a model for distributed awareness and a framework for the dynamic assembly of agents monitoring network resources. Whenever an agent needs detailed information about the individual components of another system, it uses the information gathered by the distributed awareness mechanism to identify the target system, then creates a description of a monitoring agent that is capable of providing the information about remote resources, and sends this description to the remote site. There, an agent factory dynamically assembles the monitoring agent. This solution is scalable and is suitable for heterogeneous environments where the architecture and the hardware resources of individual nodes differ, where the services provided by the system are diverse, and where the bandwidth and latency of the communication links cover a broad range.","PeriodicalId":351836,"journal":{"name":"Proceedings 9th Heterogeneous Computing Workshop (HCW 2000) (Cat. No.PR00556)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125502373","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cluster performance and the implications for distributed, heterogeneous grid performance","authors":"Craig A. Lee, C. DeMatteis, J. Stepanek, John Wang","doi":"10.1109/HCW.2000.843749","DOIUrl":"https://doi.org/10.1109/HCW.2000.843749","url":null,"abstract":"Examines the issues surrounding efficient execution in heterogeneous grid environments. The performances of a Linux cluster and a parallel supercomputer are initially compared using both benchmarks and an application. With an understanding of how benchmark and application performance is affected by processor and interconnect speed, a comparison is made with the bandwidth and latencies available in a tested grid. Of significant concern is the fact that the available communication bandwidth and latencies have a dynamic range of 3 to 4 orders of magnitude, while processor speeds have a range of about one-half order of magnitude. Also, while both processor speed and network bandwidth are increasing very rapidly, simple propagation delay will become more significant in the network latencies seen by many grid applications. That is to say, the pipes in a grid will be getting fatter but not commensurately shorter. How are we to effectively utilize such an infrastructure? Clearly, an attractive approach is to require sufficient concurrency in the application such that a coarse-grain, data-driven model of execution can be used to hide latencies while hopefully keeping context-switching overheads low. If the \"spatial component\" of an application is understood, then runtime systems could also apply established techniques like caching, compression, estimation and speculative pre-fetching. Ideally, this low-level performance management should be encapsulated in an easy-to-use abstraction.","PeriodicalId":351836,"journal":{"name":"Proceedings 9th Heterogeneous Computing Workshop (HCW 2000) (Cat. No.PR00556)","volume":"242 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123011294","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Task execution time modeling for heterogeneous computing systems","authors":"Shoukat Ali, H. Siegel, Muthucumaru Maheswaran, D. Hensgen, Sahra Ali","doi":"10.1109/HCW.2000.843743","DOIUrl":"https://doi.org/10.1109/HCW.2000.843743","url":null,"abstract":"A distributed heterogeneous computing (HC) system consists of diversely capable machines harnessed together to execute a set of tasks that vary in their computational requirements. Heuristics are needed to map (match and schedule) tasks onto machines in an HC system so as to optimize some figure of merit. This paper characterizes a simulated HC environment by using the expected execution times of the tasks that arrive in the system onto the different machines present in the system. This information is arranged in an \"expected time to compute\" (ETC) matrix as a model of the given HC system, where the entry (i, j) is the expected execution time of task i on machine j. This model is needed to simulate different HC environments to allow testing of relative performance of different mapping heuristics under different circumstances. In particular the ETC model is used to express the heterogeneity among the runtimes of the tasks to be executed, and among the machines in the HC system. An existing range-based technique to generate ETC matrices is described. A coefficient-of-variation based technique to generate ETC matrices is proposed, and compared with the range-based technique. The coefficient-of-variation-based ETC generation method provides a greater control over the spread of values (i.e., heterogeneity) in any given row or column of the ETC matrix than the range-based method.","PeriodicalId":351836,"journal":{"name":"Proceedings 9th Heterogeneous Computing Workshop (HCW 2000) (Cat. No.PR00556)","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132624648","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reliable cluster computing with a new checkpointing RAID-x architecture","authors":"K. Hwang, Hai Jin, Roy S. C. Ho, Wonwoo Ro","doi":"10.1109/HCW.2000.843742","DOIUrl":"https://doi.org/10.1109/HCW.2000.843742","url":null,"abstract":"In a serverless cluster of PCs or workstations, the cluster must allow remote file accesses or parallel I/O directly performed over disks distributed to all client nodes. We introduce a new distributed disk array, called the RAID-x, for use in serverless clusters. The RAID-x architecture is based on an orthogonal striping and mirroring (OSM) scheme, which exploits full-bandwidth and protects the system from all single disk failures. The performance of the RAID-x is experimentally proven superior to RAID-1 and NFS in the Linux cluster environment. We propose a new striped checkpointing scheme, leveraging on striped parallelism and pipelined writing of successive disk stripes. This RAID-x architecture greatly enhances the throughput, reliability, and availability of scalable clusters. It appeals especially to I/O-centric cluster applications.","PeriodicalId":351836,"journal":{"name":"Proceedings 9th Heterogeneous Computing Workshop (HCW 2000) (Cat. No.PR00556)","volume":"71 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123697901","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The NRW-Metacomputer - building blocks for a worldwide computational grid","authors":"Claus Bitten, J. Gehring, U. Schwiegelshohn, R. Yahyapour","doi":"10.1109/HCW.2000.843730","DOIUrl":"https://doi.org/10.1109/HCW.2000.843730","url":null,"abstract":"Presents the results of the NRW-Metacomputing Taskforce, which has been working on the development of a (German) country-wide metacomputer since 1996. The resulting installation is among the very few that are already operational, have full support for heterogeneous resources, contain a decent security model and feature an advanced scheduling subsystem for the metacomputing environment. The NRW-Metacomputer has been implemented using a modular software architecture. Hence, its concepts and components can be re-used by others without the need to obtain the metacomputing software as a whole. Furthermore, the NRW-Metacomputer already provides well-defined interfaces for linking the system with other metacomputing environments to form a truly global computational grid. Distinctive features of this system are its highly scalable and fault-tolerant software architecture, its advanced resource planning mechanisms, as well as an integration into a DCE (Distributed Computing Environment)/DFS (Distributed File System) environment.","PeriodicalId":351836,"journal":{"name":"Proceedings 9th Heterogeneous Computing Workshop (HCW 2000) (Cat. No.PR00556)","volume":"57 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116473807","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Master/slave computing on the Grid","authors":"Gary Shao, F. Berman, R. Wolski","doi":"10.1109/HCW.2000.843728","DOIUrl":"https://doi.org/10.1109/HCW.2000.843728","url":null,"abstract":"Resource selection is fundamental to the performance of master/slave applications. In this paper, we address the problem of promoting performance for distributed master/slave applications targeted to distributed, heterogeneous \"Grid\" resources. We present a work-rate-based model of master/slave application performance which utilizes both system and application characteristics to select potentially performance-efficient hosts for both the master and slave processes. Using a Grid allocation strategy based on this performance model, we demonstrate a performance improvement over other selection options for a representative set of master/slave applications in both simulated and actual Grid environments.","PeriodicalId":351836,"journal":{"name":"Proceedings 9th Heterogeneous Computing Workshop (HCW 2000) (Cat. No.PR00556)","volume":"91 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127123159","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Combining workstations and supercomputers to support grid applications: the parallel tomography experience","authors":"Shava Smallen, W. Cirne, J. Frey, F. Berman, R. Wolski, Mei-Hui Su, C. Kesselman, S. Young, Mark Ellisman","doi":"10.1109/HCW.2000.843748","DOIUrl":"https://doi.org/10.1109/HCW.2000.843748","url":null,"abstract":"Computational grids are becoming an increasingly important and powerful platform for the execution of large-scale, resource-intensive applications. However, it remains a challenge for applications to tap into the potential of grid resources in order to achieve performance. In this paper, we illustrate how work queue applications can leverage grids to achieve performance through coallocation. We describe our experiences in developing a scheduling strategy for a production tomography application targeted at grids that contain both workstations and parallel supercomputers. Our strategy uses dynamic information exported by a supercomputer's batch scheduler to simultaneously schedule tasks on workstations and immediately-available supercomputer nodes. This strategy is of great practical interest because it combines resources that are available in a typical research laboratory: time-shared workstations and CPU time in remote space-shared supercomputers. We show that this strategy improves the performance of the tomography application compared to traditional scheduling strategies, which target the application to either type of resource alone.","PeriodicalId":351836,"journal":{"name":"Proceedings 9th Heterogeneous Computing Workshop (HCW 2000) (Cat. No.PR00556)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2000-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129139983","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}