{"title":"User-Oriented Querying over Repositories of Data and Provenance","authors":"B. Baliś, M. Bubak, J. Wach","doi":"10.1109/E-SCIENCE.2007.81","DOIUrl":"https://doi.org/10.1109/E-SCIENCE.2007.81","url":null,"abstract":"We propose an end-user oriented approach to querying repositories of data and provenance in e-Science environments. The approach is based on ontology models describing multiple domains - in silico experiments, provenance, data, and applications. Those ontologies, integrated in a unified model and containing mappings to underlying data models, allow querying repositories of data and provenance in a unified way, or even combining provenance and data aspects in one query. We demonstrate QUery TRanslation tools (QUaTRo), built on top of the ontology models, which allow users to construct complex queries over both data and provenance repositories, expressed in terms of the domain familiar to end users. We present, in the context of the ViroLab virtual laboratory for infectious diseases, examples of construction of complex queries, combining provenance and data model aspects, which can be of practical value to scientists or medical users.","PeriodicalId":185690,"journal":{"name":"Third IEEE International Conference on e-Science and Grid Computing (e-Science 2007)","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116032353","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CDF Monte Carlo Production on LCG Grid via LcgCAF Portal","authors":"G. Compostella, D. Lucchesi, S. P. Griso, I. Sfiligoi","doi":"10.1109/E-SCIENCE.2007.18","DOIUrl":"https://doi.org/10.1109/E-SCIENCE.2007.18","url":null,"abstract":"The improvements of the luminosity of the Tevatron Collider require large increases in computing requirements for the CDF experiment which has to be able to increase proportionally the amount of Monte Carlo data it produces. This is, in turn, forcing the CDF collaboration to move beyond the use of dedicated resources and to exploit grid resources. CDF has been running a set of CDF Analysis Farms (CAFs), which are submission portals to dedicated pools, and LcgCAF is basically a reimplementation of the CAF model in order to access grid resources by using the LCG/EGEE middleware components. By means of LcgCAF, CDF users can submit analysis jobs with the same mechanism adopted for the dedicated farms and at the same time the grid resources are accessed without any specific software requirements for the sites. This is obtained using Parrot for the experiment code distribution and Frontier for the run condition database availability on the worker nodes. Currently many sites in Italy and in Europe are accessed through this portal in order to produce Monte Carlo data and in one year of operations we expect about 100,000 grid jobs submitted by the CDF users. We review here the setup used to submit jobs and retrieve the output, including the grid components CDF-specific configuration. The batch and interactive monitor tools developed to allow users to verify job status during their lifetimes in the grid environment are described. We analyze the efficiency and typical failure modes of the current grid infrastructure, reporting the performance of different parts of the system used.","PeriodicalId":185690,"journal":{"name":"Third IEEE International Conference on e-Science and Grid Computing (e-Science 2007)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122430383","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Croquet Based Virtual Museum Implementation with Grid Computing Connection","authors":"R. F. Sari, Patrick Pabeda","doi":"10.1109/E-SCIENCE.2007.24","DOIUrl":"https://doi.org/10.1109/E-SCIENCE.2007.24","url":null,"abstract":"A 3D computation technology in the form of Virtual Reality enables users to access ancient artifacts and facilitates the feel of presence. Virtual Reality consumes a lot of computing resources. Grid computing can be used to manage the distributed computation resources to perform computational processes. The Croquet application is used in this work to provide a virtual museum which will store an ancient Java manuscript. Croquet is a virtual machine which can be programmed for a collaborative 3 dimension application. The collaboration in virtual world can be conducted for multi users. In this work, we have created a virtual museum using a 3 dimension processing application support using 3D Studio Max. The Croquet application has been connected to a grid computing based on Globus and JOGL based manuscript system through Virtual Network Computing. A user acceptance test was conducted and the result indicated that the users were satisfied with the application performance, although Croquet is still rarely used despite its usefulness for Virtual Reality. The connection between the VR world and Globus based Grid Computing System for a 3D manuscript has successfully been implemented, despite the slow processing in the system.","PeriodicalId":185690,"journal":{"name":"Third IEEE International Conference on e-Science and Grid Computing (e-Science 2007)","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122173923","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Performance Studies of the StoRM Storage Resource Manager","authors":"A. Carbone, L. dell'Agnello, A. Forti, A. Ghiselli, E. Lanciotti, L. Magnoni, M. Mazzucato, R. Santinelli, V. Sapunenko, V. Vagnoni, R. Zappi","doi":"10.1109/E-SCIENCE.2007.59","DOIUrl":"https://doi.org/10.1109/E-SCIENCE.2007.59","url":null,"abstract":"High performance disk-storage solutions based on parallel file systems are becoming increasingly important to fulfill the large I/O throughput required by high-energy physics applications. Storage area networks (SAN) are commonly employed at the Large Hadron Collider data centres, and SAN-oriented parallel file systems such as GPFS and Lustre provide high scalability and availability by aggregating many data volumes served by multiple disk-servers into a single POSIX file system hierarchy. Since these file systems do not come with a storage resource manager (SRM) interface, necessary to access and manage the data volumes in a grid environment, a specific project called StoRM has been developed for providing them with the necessary SRM capabilities. In this paper we describe the deployment of a StoRM instance, configured to manage a GPFS file system. A software suite has been realized in order to perform stress tests of functionality and throughput on StoRM. We present the results of these tests.","PeriodicalId":185690,"journal":{"name":"Third IEEE International Conference on e-Science and Grid Computing (e-Science 2007)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129728991","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Performance Evaluation of Scheduling Policies for Volunteer Computing","authors":"Derrick Kondo, David P. Anderson, J. McLeod","doi":"10.1109/E-SCIENCE.2007.57","DOIUrl":"https://doi.org/10.1109/E-SCIENCE.2007.57","url":null,"abstract":"BOINC, a middleware system for volunteer computing, allows hosts to be attached to multiple projects. Each host periodically requests jobs from project servers and executes the jobs. This process involves four interrelated policies: 1) of the runnable jobs on a host, which to execute? 2) when and from what project should a host request more work? 3) what jobs should a server send in response to a given request? 4) how to estimate the remaining runtime of a job? In this paper, we consider several alternatives for each of these policies. Using simulation, we study various combinations of policies, comparing them on the basis of several performance metrics and over a range of parameters such as job length variability, deadline slack, and number of attached projects.","PeriodicalId":185690,"journal":{"name":"Third IEEE International Conference on e-Science and Grid Computing (e-Science 2007)","volume":"143 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121298628","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SOAs for Scientific Applications: Experiences and Challenges","authors":"S. Krishnan, K. Bhatia","doi":"10.1109/E-SCIENCE.2007.69","DOIUrl":"https://doi.org/10.1109/E-SCIENCE.2007.69","url":null,"abstract":"Over the past several years, with the advent of the open grid services architecture (OGSA) and the Web services resource framework (WSRF), service-oriented architectures (SOA) and Web service technologies have been embraced in the field of scientific and grid computing. These new principles promise to help make scientific infrastructures simpler to use, more cost effective to implement, and easier to maintain. However, understanding how to leverage these developments to actually design and build a system remains more of an art than a science. In this paper, we present some positions learned through experience that provide guidance in leveraging SOA technologies to build scientific infrastructures. In addition, we present the technical challenges that need to be addressed in building an SOA, and as a case study, we present the SOA that we have designed for the national biomedical computation resource (NBCR) community. We discuss how we have addressed these technical challenges, and present the overall architecture, the individual software toolkits developed, the client interfaces, and the usage scenarios. We hope that our experiences prove to be useful in building similar infrastructures for other scientific applications.","PeriodicalId":185690,"journal":{"name":"Third IEEE International Conference on e-Science and Grid Computing (e-Science 2007)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125640272","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Connecting Scientific Data to Scientific Experiments with Provenance","authors":"S. Miles, E. Deelman, Paul T. Groth, K. Vahi, Gaurang Mehta, L. Moreau","doi":"10.1109/E-SCIENCE.2007.22","DOIUrl":"https://doi.org/10.1109/E-SCIENCE.2007.22","url":null,"abstract":"As scientific workflows, and the data they operate on, grow in size and complexity, the task of defining how those workflows should execute (which resources to use, where the resources must be in readiness for processing etc.) becomes proportionally more difficult. While \"workflow compilers\", such as Pegasus, reduce this burden, a further problem arises: since specifying details of execution is now automatic, a workflow's results are harder to interpret, as they are partly due to specifics of execution. By automating steps between the experiment design and its results, we lose the connection between them, hindering interpretation of results. To reconnect the scientific data with the original experiment, we argue that scientists should have access to the full provenance of their data, including not only parameters, inputs and intermediary data, but also the abstract experiment, refined into a concrete execution by the \"workflow compiler\". In this paper, we describe preliminary work on adapting Pegasus to capture the process of workflow refinement in the PASOA provenance system.","PeriodicalId":185690,"journal":{"name":"Third IEEE International Conference on e-Science and Grid Computing (e-Science 2007)","volume":"128 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125777658","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Storing and Querying Scientific Workflow Provenance Metadata Using an RDBMS","authors":"Artem Chebotko, Xubo Fei, Cui Lin, Shiyong Lu, F. Fotouhi","doi":"10.1109/E-SCIENCE.2007.70","DOIUrl":"https://doi.org/10.1109/E-SCIENCE.2007.70","url":null,"abstract":"Provenance management has become increasingly important to support scientific discovery reproducibility, result interpretation, and problem diagnosis in scientific workflow environments. This paper proposes an approach to provenance management that seamlessly integrates the interoperability, extensibility, and reasoning advantages of semantic Web technologies with the storage and querying power of an RDBMS. Specifically, we propose: i) two schema mapping algorithms to map an arbitrary OWL provenance ontology to a relational database schema that is optimized for common provenance queries; ii) two efficient data mapping algorithms to map provenance RDF metadata to relational data according to the generated relational database schema, and iii) a schema-independent SPARQL-to-SQL translation algorithm that is optimized on-the-fly by using the type information of an instance available from the input provenance ontology and the statistics of the sizes of the tables in the database. Experimental results are presented to show that our algorithms are efficient and scalable.","PeriodicalId":185690,"journal":{"name":"Third IEEE International Conference on e-Science and Grid Computing (e-Science 2007)","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126005659","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Semantic-Based On-demand Synthesis of Grid Activities for Automatic Workflow Generation","authors":"M. Siddiqui, A. Villazón, T. Fahringer","doi":"10.1109/E-SCIENCE.2007.68","DOIUrl":"https://doi.org/10.1109/E-SCIENCE.2007.68","url":null,"abstract":"On-demand synthesis of grid activities can play a significant role in automatic workflow composition and in improving quality of the grid resource provisioning. However, in the grid, synthesis of activities has been largely ignored due to the limited expressiveness of the representation of activity capabilities and the lack of adapted resource management means to take advantage of such activity synthesis. This paper introduces a new mechanism for automatic synthesis of available activities in the grid by applying ontology rules. Rule-based synthesis combines multiple primitive activities to form new compound activities. The synthesized activities can be provisioned as new or alternative options for negotiation as well as advance reservation. This is a major advantage compared to other approaches that only focus on resource matching and brokerage. Furthermore, the new synthesized activities provide aggregated capabilities that otherwise may not be possible, leading towards an automatic generation of grid workflows. We developed a prototype to demonstrate advantages of our approach.","PeriodicalId":185690,"journal":{"name":"Third IEEE International Conference on e-Science and Grid Computing (e-Science 2007)","volume":"123 5","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131957140","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Peer-to-Peer Based Grid Workflow Runtime Environment of SwinDeW-G","authors":"Yun Yang, Ke Liu, Jinjun Chen, Joel Lignier, Hai Jin","doi":"10.1109/E-SCIENCE.2007.56","DOIUrl":"https://doi.org/10.1109/E-SCIENCE.2007.56","url":null,"abstract":"Nowadays, grid and peer-to-peer (p2p) technologies have become popular solutions for large-scale resource sharing and system integration. For e-science workflow systems, grid is a convenient way of constructing new services by composing existing services, while p2p is an effective approach to eliminate the performance bottlenecks and enhance the scalability of the systems. However, existing workflow systems focus either on p2p or grid environments and therefore cannot take advantage of both technologies. It is desirable to incorporate the two technologies in workflow systems. SwinDeW-G (Swinburne Decentralised Workflow for Grid) is a novel hybrid decentralised workflow management system facilitating both grid and p2p technologies. It is derived from the former p2p based SwinDeW system but redeveloped as grid services with communications between peers conducted in a p2p fashion. This paper describes the system design and functions of the runtime environment of SwinDeW-G.","PeriodicalId":185690,"journal":{"name":"Third IEEE International Conference on e-Science and Grid Computing (e-Science 2007)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133966379","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}