{"title":"High-Performance MPI Broadcast Algorithm for Grid Environments Utilizing Multi-lane NICs","authors":"Tatsuhiro Chiba, Toshio Endo, S. Matsuoka","doi":"10.1109/CCGRID.2007.59","DOIUrl":"https://doi.org/10.1109/CCGRID.2007.59","url":null,"abstract":"The performance of MPI collective operations, such as broadcast and reduction, is heavily affected by network topologies, especially in grid environments. Many techniques to construct efficient broadcast trees have been proposed for grids. On the other hand, recent high performance computing nodes are often equipped with multi-lane network interface cards (NICs), which most previous collective communication methods fail to harness effectively. Our new broadcast algorithm for grid environments harnesses almost all downward and upward bandwidths of multi-lane NICs; a message to be broadcast is split into two pieces, which are broadcast along two independent binary trees in a pipelined fashion, and swapped between both trees. The salient feature of our algorithm is generality; it works effectively on both large clusters and grid environments. It can also be applied to nodes with a single NIC, by making multiple sockets share the NIC. Experiments on an emulated network environment show that we achieve higher performance than traditional methods, regardless of network topologies or message sizes.","PeriodicalId":278535,"journal":{"name":"Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGrid '07)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121548863","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An On-line Algorithm for Fair-Share Node Allocations in a Cluster","authors":"Lior Amar, A. Barak, Ely Levy, Michael Okun","doi":"10.1109/CCGRID.2007.22","DOIUrl":"https://doi.org/10.1109/CCGRID.2007.22","url":null,"abstract":"Proportional (fair) share schedulers are designed to provide applications with predefined portions of system resources. Single node operating systems use context-switch (preemption) to dynamically allocate the CPU(s) to running processes. This paper presents an online algorithm for proportional share allocations of nodes in a cluster, in a fashion that resembles a single-node system. The algorithm relies on preemptive process migrations for dynamic allocations of nodes to users. The paper presents the algorithm and its performance on a MOSIX organizational Grid with 60 nodes. We show that proportional share allocations can be achieved in a relatively short time (minutes).","PeriodicalId":278535,"journal":{"name":"Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGrid '07)","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127876284","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On the Efficiency and Cost of Introducing QoS in BitTorrent","authors":"N. Andrade, Jaindson Santana, F. Brasileiro, W. Cirne","doi":"10.1109/CCGRID.2007.75","DOIUrl":"https://doi.org/10.1109/CCGRID.2007.75","url":null,"abstract":"BitTorrent is currently a de facto standard for scalable content-distribution. However, its peer-to-peer model for resource allocation does not provide high availability and its performance depends on best-effort contributions given by peers. This has motivated several content-providers to use a hybrid model in which they operate a superpeer in order to attain a higher quality of service. In this paper, we use BitTorrent traces and analytical modelling to investigate the cost incurred by such an entity in relation to the benefits it can provide to the system.","PeriodicalId":278535,"journal":{"name":"Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGrid '07)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126559474","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Large Scale Deployment of Molecular Docking Application on Computational Grid infrastructures for Combating Malaria","authors":"V. Kasam, J. Salzemann, N. Jacq, A. Maaß, V. Breton","doi":"10.1109/CCGRID.2007.66","DOIUrl":"https://doi.org/10.1109/CCGRID.2007.66","url":null,"abstract":"Computational grids are solutions for several biological applications like virtual screening or molecular dynamics where large amounts of computing power and storage are required. The WISDOM project successfully deployed virtual screening at large scale on EGEE grid infrastructures in the summer of 2005 and achieved 46 million dockings in 45 days, which is equivalent to 80 CPU years. WISDOM is one good example of a successful deployment of an embarrassingly parallel application. In this paper, we describe the improvements in our deployment. We screened the ZINC database against four targets implicated in malaria. Over more than two and a half months, we achieved 140 million dockings, representing an average throughput of almost 80,000 dockings per hour. This was made possible by the availability of thousands of CPUs through different infrastructures worldwide. Through the acquired experience, the WISDOM production environment is evolving to enable an easy and fault-tolerant deployment of biological tools.","PeriodicalId":278535,"journal":{"name":"Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGrid '07)","volume":"94 12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128953123","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adaptive Performance Modeling on Hierarchical Grid Computing Environments","authors":"W. Nasri, L. Steffenel, D. Trystram","doi":"10.1109/CCGRID.2007.17","DOIUrl":"https://doi.org/10.1109/CCGRID.2007.17","url":null,"abstract":"In the past, efficient parallel algorithms have always been developed specifically for the successive generations of parallel systems (vector machines, shared-memory machines, distributed-memory machines, etc.). Today, for many reasons, such as the inherent heterogeneity, the diversity, and the continuous evolution of the existing parallel execution supports, it is very hard to solve a target problem efficiently with a single algorithm or to write portable programs that perform well on any computational support. Toward this goal, we propose a generic framework based on communication models and adaptive approaches in order to adaptively model performance on grid computing environments. We apply this methodology to collective communication operations and show, through experiments on a real platform, that the framework delivers significant performance while determining the best model-algorithm combination depending on the problem and architecture parameters.","PeriodicalId":278535,"journal":{"name":"Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGrid '07)","volume":"85 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132889336","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Scheduling Remote Access to Scientific Instruments in Cyberinfrastructure for Education and Research","authors":"Jie Yin, Junwei Cao, Yuexuan Wang, Lianchen Liu, Cheng Wu","doi":"10.1109/CCGRID.2007.103","DOIUrl":"https://doi.org/10.1109/CCGRID.2007.103","url":null,"abstract":"While a grid represents a computing infrastructure for cross-domain sharing of computational resources, the cyberinfrastructure, proposed by the US NSF Blue-Ribbon advisory panel, is expected to revolutionize science and engineering by including more computer-integrated resources, e.g. telescopes and observatories. As a part of the China national cyberinfrastructure for education and research, resource sharing of expensive scientific instruments is discussed in this work. A layered model of instrument pools is introduced and the process from submitting a job to instrument pools to obtaining results is analyzed. Fuzzy random scheduling algorithms are proposed for instrument pools, applied when a job is submitted to one of the instruments within a pool. The randomness lies in the probability with which an instrument is chosen for an experiment, and the fuzziness originates from the vagueness of users' feedback on experimental results. Users' feedback information is utilized to improve the overall quality of service (QoS) of an instrument cyberinfrastructure. Several algorithms are provided to increase utilization of instruments providing higher QoS and decrease utilization of those with poor QoS. This is demonstrated in detail using quantitative simulation results included in this paper.","PeriodicalId":278535,"journal":{"name":"Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGrid '07)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123910038","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"UIMA GRID: Distributed Large-scale Text Analysis","authors":"M. T. Egner, M. Lorch, Edd Biddle","doi":"10.1109/CCGRID.2007.118","DOIUrl":"https://doi.org/10.1109/CCGRID.2007.118","url":null,"abstract":"This paper shows how loosely coupled compute resources, managed by Condor, can be leveraged together with IBM OmniFind to implement a scalable environment for text analysis based on the Unstructured Information Management Architecture (UIMA). Text analysis can be used to extract valuable knowledge, such as entities and their relationships, from unstructured text data. When applied to large amounts of data, e.g., on the order of several million documents, the process can be too time consuming to react to business needs. This becomes a particular problem when the rule sets, dictionaries, or taxonomies used by the text analysis components are changed to extract new information for a particular business purpose. Such changes may require that the entire set of documents be reanalyzed. In the scenario motivating this work, a constantly growing set of currently 10 million documents needs to be re-processed frequently to accommodate such changes. The text analysis algorithms deployed are very complex and compute intensive, currently requiring about 20 CPU-years for a full re-analysis. Through the distributed architecture discussed in this paper, the re-analysis can be performed in one calendar month by opportunistically leveraging compute nodes from a heterogeneous Condor pool.","PeriodicalId":278535,"journal":{"name":"Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGrid '07)","volume":"613 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116469658","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"QEF - Supporting Complex Query Applications","authors":"F. Porto, Othman Tajmouati, V. F. V. D. Silva, B. Schulze, F. Ayres","doi":"10.1109/CCGRID.2007.89","DOIUrl":"https://doi.org/10.1109/CCGRID.2007.89","url":null,"abstract":"This paper describes QEF, a query evaluation framework designed to support complex applications on the grid. QEF has been extended to support querying within a number of different applications, including supporting scientific visualization and implementing a web service semantic search engine. Application requests take the form of a workflow in which tasks are represented as algebraic operators and specific data types are enveloped into a common tuple structure. The implemented system is automatically deployed onto scheduled grid nodes and autonomously manages query evaluation according to grid environment conditions. The generality of our approach has been tested with a number of applications, leading to a full grid web service implementation available at http://codims.epfl.ch.","PeriodicalId":278535,"journal":{"name":"Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGrid '07)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123650847","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"BBCLB: A Bulletin-Board based Cooperative Load Balance Strategy for Service Grid","authors":"Tianyu Wo, Liang Zhong, Chunming Hu, J. Huai","doi":"10.1109/CCGRID.2007.27","DOIUrl":"https://doi.org/10.1109/CCGRID.2007.27","url":null,"abstract":"Although much effort has been devoted to load balancing in network and job scheduling systems, most of the resulting approaches cannot be applied directly in the service grid environment, since they are often designed for homogeneous systems with limited scalability. It remains a challenging problem to balance the load among service grid nodes, which are often highly dynamic, heterogeneous, and linked by wide-area networks. In this paper, we present a load balance strategy using several bulletin-boards as load intermediates among grid nodes. A modified threshold-based load transfer algorithm is applied with a non-preemptive selection policy. Based on this strategy, a load balance system is realized in CROWN, a service-oriented grid middleware, and deployed in the CROWN testbed. The performance evaluations show that our strategy can effectively balance the load of service invocation and improve system throughput.","PeriodicalId":278535,"journal":{"name":"Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGrid '07)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129075333","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Standardization of an API for Distributed Resource Management Systems","authors":"Peter Tröger, H. Rajic, Andreas Haas, P. Domagalski","doi":"10.1109/CCGRID.2007.109","DOIUrl":"https://doi.org/10.1109/CCGRID.2007.109","url":null,"abstract":"Today's cluster and grid environments demand the usage of product-specific APIs and tools for developing distributed applications. We give an overview of the Distributed Resource Management Application API (DRMAA) specification, which defines a common interface for job submission, control, and monitoring. The DRMAA specification was developed by the authors at the Open Grid Forum standardization body, and has since seen significant adoption in academic and commercial cluster systems. Within this paper, we describe the basic concepts of the finalized API, and explain issues and findings with the standardization of such a unified interface.","PeriodicalId":278535,"journal":{"name":"Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGrid '07)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127986001","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}