A. Thrasher, Rory Carmichael, Peter Bui, Li Yu, D. Thain, S. Emrich
The 5th Workshop on Workflows in Support of Large-Scale Science, published 2010-12-17. DOI: 10.1109/WORKS.2010.5671858 (https://doi.org/10.1109/WORKS.2010.5671858)
Taming Complex Bioinformatics Workflows with Weaver, Makeflow, and Starch
In this paper we discuss the challenges that common bioinformatics applications face when deployed outside their initial development environments. We propose a three-tiered approach that mitigates some of these issues by combining an encapsulation tool, a high-level workflow language, and a portable intermediary. As a case study, we apply this approach to refactor a custom expressed sequence tag (EST) analysis pipeline. The Starch tool encapsulates program dependencies to simplify task specification and deployment. The Weaver language provides abstractions for distributed computing and naturally encourages code modularity. The Makeflow workflow engine executes compiled Weaver code independently of the underlying batch system. To illustrate the benefits of our framework, we compare implementations, report their performance, and discuss the advantages of our workflow approach over traditional bioinformatics development.
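
As a rough illustration of the idea of compiling a high-level description into a portable intermediary, the sketch below expands a "map one tool over many inputs" request into Make-style rules, the general rule shape that Makeflow's input language is modeled on. This is not Weaver's actual API or the paper's pipeline: the function names make_rule and map_rules, the file names, and the est_filter.sfx executable (standing in for an encapsulated tool) are all hypothetical and chosen only for illustration.

```python
from typing import List


def make_rule(targets: List[str], sources: List[str], command: str) -> str:
    """Render one Make-style rule: 'targets: sources', then a tab-indented command."""
    return f"{' '.join(targets)}: {' '.join(sources)}\n\t{command}\n"


def map_rules(executable: str, inputs: List[str], suffix: str = ".out") -> str:
    """Emit one independent rule per input file.

    Because no rule depends on another rule's output, an engine reading
    these rules is free to run them in parallel.
    """
    rules = []
    for src in inputs:
        dst = src + suffix
        # The executable is listed as a source so that it is declared as a
        # dependency of the task alongside the input file.
        command = f"./{executable} {src} > {dst}"
        rules.append(make_rule([dst], [executable, src], command))
    return "\n".join(rules)


if __name__ == "__main__":
    # Hypothetical names: 'est_filter.sfx' stands in for an encapsulated tool.
    print(map_rules("est_filter.sfx", ["reads.0.fasta", "reads.1.fasta"]))
```

The point of the sketch is the separation of concerns: the high-level call names only the tool and its inputs, while the generated rules spell out every file each task needs, which is what lets a separate engine dispatch the tasks to whatever batch system is available.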