{"title":"GRASG - a framework for \"gridifying\" and running applications on service-oriented grids","authors":"Quoc-Thuan Ho, T. Hung, W. Jie, H. Chan, E. Sindhu, S. Ganesan, T. Zang, Xiaorong Li","doi":"10.1109/CCGRID.2006.48","DOIUrl":"https://doi.org/10.1109/CCGRID.2006.48","url":null,"abstract":"The convergence of grid computing technologies and Web services offers many opportunities to utilize resources distributed across the Internet and solves many issues of interoperability. As a result, enabling applications as Web services are required intensively. Hence, a framework for \"gridifying\" and running applications on service-oriented grids (GRASG) was built to offer developers a flexible and effective tool for \"gridifying\" applications and making use of distributed resources on grid environment without much effort from the developers. It allows users to quickly enable an application as a Web service and access this service in a simple fashion. Further, in order to make use of distributed resources, GRASG provides a metascheduling mechanism that is able to schedule jobs to grid resources using Web services protocol. These features reduce the time taken for application development and execution.","PeriodicalId":419226,"journal":{"name":"Sixth IEEE International Symposium on Cluster Computing and the Grid (CCGRID'06)","volume":"161 ","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131691446","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient many-to-one communication for a distributed RAID","authors":"A. Marco, G. Ciaccio","doi":"10.1109/CCGRID.2006.39","DOIUrl":"https://doi.org/10.1109/CCGRID.2006.39","url":null,"abstract":"Any set of autonomous workstations, however networked (by a LAN, a MAN, or wireless), can be seen as a collection of networked low cost disks. Such a collection can be operated by proper software so as to provide the abstraction of a single, larger block device, made available to all the participants on a peer-to-peer basis. By adding enough data redundancy, the disk collection as a whole could act as a single distributed RAID, providing capacity and reliability along with the convenient price/performance typical of commodity hard disks. This paper reports on issues of communication performance in a prototype of distributed RAID device called DRAID. DRAID offers storage services under a single I/O space (SIOS) block device abstraction. The SIOS feature implies that the storage space is accessible through each of the participant stations, rather than through one or few fixed end-points. The paper focuses on the inefficiency of communication when a client reads data stripes from a number of remote servers in a gigabit Ethernet LAN. The congestion caused by such a many-to-one communication pattern has been faced in multiple ways, but the best result has been obtained by modifying the traditional, and unsuccessful, congestion avoidance policy of TCP/IP.","PeriodicalId":419226,"journal":{"name":"Sixth IEEE International Symposium on Cluster Computing and the Grid (CCGRID'06)","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131850090","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"GridFS: Targeting Data Sharing in Grid Environments","authors":"Marcelo Nery dos Santos, Renato Cerqueira","doi":"10.1109/CCGRID.2006.141","DOIUrl":"https://doi.org/10.1109/CCGRID.2006.141","url":null,"abstract":"GridFS is a system that enables data sharing in a cluster or grid environment. By deploying a set of servers over several nodes, it is possible to build a federated wide area file system integrating tera scale sized data. The data is stored in different hosts under a single name space controlled by GridFS. It was designed and developed considering interoperability, scalability and performance issues. Also, GridFS provides special functions for process schedulers, such as file transfer rate estimates, and supports legacy applications","PeriodicalId":419226,"journal":{"name":"Sixth IEEE International Symposium on Cluster Computing and the Grid (CCGRID'06)","volume":"68 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133402958","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cluster and Grid Based Classification of Transposable Elements in Eukaryotic Genomes","authors":"N. Ranganathan, C. Feschotte, David Levine","doi":"10.1109/CCGRID.2006.127","DOIUrl":"https://doi.org/10.1109/CCGRID.2006.127","url":null,"abstract":"In the last few years many computer and laboratory improvements in the production and analysis of DNA sequences have made possible the complete sequencing of whole genomes. This provides a wealth of raw genomes that needs to be processed and annotated. All eukaryotic genomes examined and published thus far contain repetitive DNA. The amount of repetitive DNA in any specific eukaryotic genome ranges from 5% to 80%. These repeats consist mainly of transposable elements and tandem repeats which need to be identified, classified and annotated in order to sequence and annotate an entire genome. This paper discusses the design and implementation of a distributed cluster and grid based workflow to classify transposable elements. We show experimental results for representative species genomes on a cluster and grid. The performance and results of the workflow with regard to turnaround time, scalability, load balancing, resource utilization and fault tolerance are shown and discussed","PeriodicalId":419226,"journal":{"name":"Sixth IEEE International Symposium on Cluster Computing and the Grid (CCGRID'06)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115192183","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Statistical Properties of Task Running Times in a Global-Scale Grid Environment","authors":"M. Dobber, R. Mei, G. Koole","doi":"10.1109/CCGRID.2006.98","DOIUrl":"https://doi.org/10.1109/CCGRID.2006.98","url":null,"abstract":"Grid computing technology connects globally distributed processors to develop an immense source of computing power, which enables us to run applications in parallel that would take orders of magnitude more time on a single processor. Key characteristics of a global-scale grid are the strong burstiness in the amount of load on the resources and on the network capacities, and the fact that processors may be appended to or removed from the grid at any time. To cope with these characteristics, it is essential to develop techniques that make applications robust against the dynamics of the grid environment. For these techniques to be effective, it is important to have an understanding of the statistical properties of the dynamics of a grid environment. Today, however, the statistical properties of the dynamic behavior of real global-scale grid environments are not well understood. Our main focus is on highly CPU-intensive grid applications that require huge amounts of processor power for running tasks. Motivated by this, we have performed extensive measurements in a real, global-scale grid environment to study the statistical properties of the running times of tasks on processors. We observe (1) a strong burstiness of the running times over different time scales, (2) a strong heterogeneity of the running-time characteristics among the different hosts, (3) a strong heterogeneity of the running-time characteristics for the same host over different time intervals, and (4) the occurrence of sudden level-switches in the running times, amongst others. These observations are used to develop effective techniques for the prediction of running times. They can be used to develop effective control schemes for robust grid applications.","PeriodicalId":419226,"journal":{"name":"Sixth IEEE International Symposium on Cluster Computing and the Grid (CCGRID'06)","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124277737","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Using multiple grid resources for bioinformatics applications in GADU","authors":"Dinanath Sulakhe, Alex Rodriguez, M. Wilde, Ian T Foster, N. Maltsev","doi":"10.1109/CCGRID.2006.182","DOIUrl":"https://doi.org/10.1109/CCGRID.2006.182","url":null,"abstract":"During the past decade, the scientific community has witnessed the rapid accumulation of gene sequence data and data related to physiology and biochemistry of organisms. Bioinformatics tools used for efficient and computationally intensive analysis of genetic sequences require large-scale computational resources to accommodate the growing data. Grid computational resources such as the Open Science Grid and TeraGrid have proved useful for scientific discovery. GADU is a high-throughput computational system developed to automate the steps involved in accessing the Grid resources for running bioinformatics applications. This paper describes the requirements for building an automated scalable system such as GADU that can run a job simultaneously on different grids. The paper describes the resource-independent configuration of GADU using the Pegasus-based virtual data system that helps in using heterogeneous grid resources. The paper also highlights the features implemented to make GADU a gateway to computationally intensive bioinformatics applications on the Grid","PeriodicalId":419226,"journal":{"name":"Sixth IEEE International Symposium on Cluster Computing and the Grid (CCGRID'06)","volume":"2011 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114744608","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A feedback mechanism for network scheduling in LambdaGrids","authors":"P. Datta, Sushant Sharma, Wu-chun Feng","doi":"10.1109/CCGRID.2006.5","DOIUrl":"https://doi.org/10.1109/CCGRID.2006.5","url":null,"abstract":"Next-generation e-Science applications will require the ability to transfer information at high data rates between distributed computing centers and data repositories. A Lambda-Grid offers dedicated, optical, circuit-switched, point-to-point connections, which may be reserved exclusively for an application. Though such dedicated high-speed connections eliminate congestion in the network, they effectively push the network congestion out to the end systems, as processing speeds have not kept up with networking speeds. Therefore, developing an efficient transport protocol over such high-speed dedicated circuits is of critical importance. In this work, we propose the idea of a lightweight end-system protocol, based on performance monitoring, to significantly improve the performance of data transport over a LambdaGrid. In particular, we focus on dynamically monitoring the OS task scheduling at the receiving end-system so that potential end-system congestion may be detected early and appropriate feedback can be transmitted back to the sending end-system to avoid packet losses. One example of such an evasive action is to suspend transmission for a certain duration of time during which the OS on the receiving end-system must handle other computational processes. With this in mind, we propose to extend the Reliable-Blast UDP (RBUDP) protocol to take such evasive action by using a simple feedback mechanism that is activated via performance monitoring. The new protocol, named RBUDP dramatically improves the performance of data transfer over LambdaGrids. We demonstrate the effectiveness of our proposed protocol and illustrate the performance gains achieved via network emulation.","PeriodicalId":419226,"journal":{"name":"Sixth IEEE International Symposium on Cluster Computing and the Grid (CCGRID'06)","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121961438","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An ontology-based conceptual mapping framework for translating FBPML to the Web services ontology","authors":"G. Nadarajan, Y. Chen-Burger","doi":"10.1109/CCGRID.2006.17","DOIUrl":"https://doi.org/10.1109/CCGRID.2006.17","url":null,"abstract":"This paper presents an ontology-based conceptual mapping framework that translates a formal and visually rich business process modeling (BPM) language, Fundamental Business Process Modelling Language (FBPML) to a semantic Web-based language, the Web Services Ontology (OWL-S). The translation aims to narrow the gap between enterprise modelling methods and semantic Web services, thus bringing the two communities closer. Another significant contribution of the translation is that it allows more mature technologies such as BPM methods to be utilised within emerging fields that are constantly evolving, such as the semantic Web. The framework is divided into a data model translation and a process model translation. An implementation and an evaluation of the process model translation are demonstrated and discussed.","PeriodicalId":419226,"journal":{"name":"Sixth IEEE International Symposium on Cluster Computing and the Grid (CCGRID'06)","volume":"38 6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125736479","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"End-to-end trustworthy data access in data-oriented scientific computing","authors":"S. Pallickara, Beth Plale, Liang Fang, Dennis Gannon","doi":"10.1109/CCGRID.2006.41","DOIUrl":"https://doi.org/10.1109/CCGRID.2006.41","url":null,"abstract":"Data-driven computational science on community computational resources is frequently of a magnitude and scale that it requires that computations be done remotely, generating resulting data collections that are too large to be shipped back to a user's workstation. Service-oriented middleware is well equipped to carry out actions on behalf of a user, but SOA middleware does not address user trust in the privacy of their actions and security of their data. In this paper we develop a model that represents the trust relationship between the users and their remote resources in the grid system. We show how one can construct a trusted relationship from the model, with an emphasis on the importance of context to a specific trust relationship.","PeriodicalId":419226,"journal":{"name":"Sixth IEEE International Symposium on Cluster Computing and the Grid (CCGRID'06)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125890284","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Segregate Applications at System Level to Eliminate Security Problems","authors":"C. Jong","doi":"10.1109/CCGRID.2006.165","DOIUrl":"https://doi.org/10.1109/CCGRID.2006.165","url":null,"abstract":"Improvements in advanced microprocessor design and cost/performance gains in hardware technology have changed the distributed computing paradigm from a homogeneous parallel computation to a heterogeneous cluster one. This new paradigm involves coordinating and sharing computing, application, data, storage, and network resources across dynamic and possibly geographically dispersed organizations. To attract organizations to take advantage of off-the-shelf ready-to-build commodity clusters, substantial improvements have been realized in many areas such as resource allocation and management, process distribution and recovery, data integrity and application security. However, the primary factor above all others as we approach this new level of computing is trust - higher confidence in the privacy and security of data and resources is needed to advance to the next level. Most organizations avoid running applications using their private data on systems that are not under their control until a sufficient confidence of trust is built. Proofs of information security help build a higher level of trust and thus increase the utilization of the shared cluster. When launching applications on computer systems, five potential security threats arise at user, protocol, system, communication and hardware levels. To secure information, each level has to execute a set of protection tasks. Full trust will be achieved after all levels are proven immune from attack. In a conventional system, security is guaranteed if the hosting system is wholly controlled by the applications. Therefore, to protect confidential data between applications in a shared system, the traditional approach is to separate the entire system by either spatial or time methods. Here we introduce a resource separating and grouping mechanism that physically and logically separates system resources by adaptable scale to eliminate security problems and reduce the overall cost","PeriodicalId":419226,"journal":{"name":"Sixth IEEE International Symposium on Cluster Computing and the Grid (CCGRID'06)","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127170697","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}