A Web Service-Enabled Distributed Workflow System for Scientific Data Processing
Rajesh Kalyanam, Lan Zhao, Taezoon Park, S. Goasguen
11th IEEE International Workshop on Future Trends of Distributed Computing Systems (FTDCS'07), 2007-03-21
DOI: 10.1109/FTDCS.2007.9
Citation count: 7
Abstract
This paper presents the design and implementation of a distributed, data-driven workflow system built on top of the TeraGrid infrastructure. The system is based on a data management architecture that provides access to scientific data collections over the TeraGrid network, and it allows researchers to construct scientific workflows for data discovery, access, transformation, and analysis. It leverages JOpera, an open-source workflow engine and visual composer, together with a set of Web service-based data and computation modules. To demonstrate its effectiveness, we create an end-to-end climate simulation data analysis workflow that connects the data management architecture to TeraGrid computation resources. We also develop a workflow monitoring service to keep track of distributed workflow execution.