Effective resource multiplexing for scientific workflows
T. Ryan, Young Choon Lee
2015 17th Asia-Pacific Network Operations and Management Symposium (APNOMS), published 2015-08-01
DOI: 10.1109/APNOMS.2015.7275432 (https://doi.org/10.1109/APNOMS.2015.7275432)
Scientific workflows feature complex precedence constraints that are mostly dictated by data dependencies between tasks. Inter-task communication (data staging) in these workflow applications incurs significant overhead, which is a major obstacle to high performance and effective resource utilization. As these applications keep growing in scale, primarily because of the recent explosive growth of data, removing this obstacle is of great practical importance. In this paper, we present a resource multiplexing (RM) technique that leverages data staging to minimize the idle time between task executions caused by inter-task communication overhead. In particular, we incorporate RM into our DEWE framework by making a set of extensions to it. The rationale behind RM is that each slot or core pairs the actual workflow task with the RM-enabled file-loading DEWE extension (File Client) in such a way that their resource usage is complementary. We demonstrate the efficacy of our multiplexing technique in a data-intensive computing environment using an astronomy application. Experiments conducted on Amazon EC2 show that the technique reduces resource idle time between jobs by 57% on average and by up to 91%.
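The pairing of a compute task with a file loader on the same slot can be illustrated with a small timing model. This is only a sketch under stated assumptions, not the DEWE implementation: it assumes a single core, a loader that begins staging the next task's inputs at the moment the current task starts computing, and staging that does not slow the computation down. The `Task` type, the task names, and all durations are hypothetical and chosen for illustration; they are not measurements from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str         # hypothetical task label
    stage_s: float    # seconds needed to stage (load) the task's input files
    compute_s: float  # seconds needed to run the task once inputs are local


def idle_sequential(tasks):
    """Without multiplexing the core waits for every staging step in full."""
    return sum(t.stage_s for t in tasks)


def idle_multiplexed(tasks):
    """Assumed model: staging for task i+1 starts when task i starts computing.

    The core only idles for the part of a staging step that outlasts the
    computation it was overlapped with.
    """
    if not tasks:
        return 0.0
    idle = tasks[0].stage_s  # the first load has no computation to hide behind
    for prev, nxt in zip(tasks, tasks[1:]):
        idle += max(0.0, nxt.stage_s - prev.compute_s)
    return idle


# Illustrative durations only.
tasks = [
    Task("a", stage_s=4.0, compute_s=10.0),
    Task("b", stage_s=6.0, compute_s=3.0),
    Task("c", stage_s=5.0, compute_s=8.0),
]

seq = idle_sequential(tasks)
mux = idle_multiplexed(tasks)
reduction = 1.0 - mux / seq

print(f"idle without multiplexing: {seq:.1f} s")
print(f"idle with multiplexing:    {mux:.1f} s")
print(f"reduction in idle time:    {reduction:.0%}")
```

In this toy model the staging of task "b" is fully hidden behind the computation of "a", while task "c" still waits 2 seconds because "b" computes for less time than "c" needs to stage. The benefit therefore depends on how well the staging and computation times of neighbouring tasks complement each other.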