Flux: Overcoming Scheduling Challenges for Exascale Workflows
D. Ahn, Ned Bass, Albert Chu, J. Garlick, Mark Grondona, Stephen Herbein, Helgi I. Ingólfsson, Joseph Koning, Tapasya Patki, T. Scogland, B. Springmeyer, M. Taufer
DOI: 10.1109/WORKS.2018.00007

Abstract: Many emerging scientific workflows that target high-end HPC systems require complex interplay with the resource and job management software (RJMS). However, portable, efficient, and easy-to-use scheduling and execution of these workflows is still an unsolved problem. We present Flux, a novel, hierarchical RJMS infrastructure that addresses the key scheduling challenges of modern workflows in a scalable, easy-to-use, and portable manner. At the heart of Flux lies its ability to be nested seamlessly within batch allocations created by other schedulers as well as by Flux itself. Once a hierarchy of Flux instances is created within each allocation, its consistent and rich set of well-defined APIs portably and efficiently supports workflows that often feature non-traditional execution patterns such as complex co-scheduling requirements, massive ensembles of small jobs, and coordination among jobs in an ensemble.
{"title":"DagOn*: Executing Direct Acyclic Graphs as Parallel Jobs on Anything","authors":"R. Montella, D. Di Luccio, Sokol Kosta","doi":"10.1109/WORKS.2018.00012","DOIUrl":"https://doi.org/10.1109/WORKS.2018.00012","url":null,"abstract":"The democratization of computational resources, thanks to the advent of public, private, and hybrid clouds, changed the rules in many science fields. For decades, one of the effort of computer scientists and computer engineers was the development of tools able to simplify access to high-end computational resources by computational scientists. However, nowadays any science field can be considered \"computational\" as the availability of powerful, but easy to manage workflow engines, is crucial. In this work, we present DagOn* (Direct acyclic graph On anything), a lightweight Python library implementing a workflow engine able to execute parallel jobs represented by direct acyclic graphs on any combination of local machines, on-premise high performance computing clusters, containers, and cloud-based virtual infrastructures. We use a real-world production-level application for weather and marine forecasts to illustrate the use of this new workflow engine.","PeriodicalId":154317,"journal":{"name":"2018 IEEE/ACM Workflows in Support of Large-Scale Science (WORKS)","volume":"55 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117300695","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Planner: Cost-Efficient Execution Plans Placement for Uniform Stream Analytics on Edge and Cloud
Laurent Prosperi, Alexandru Costan, Pedro Silva, Gabriel Antoniu
DOI: 10.1109/WORKS.2018.00010

Abstract: Stream processing applications handle unbounded and continuous flows of data items which are generated from multiple geographically distributed sources. Two approaches are commonly used for processing: Cloud-based analytics and Edge analytics. The first one routes the whole data set to the Cloud, incurring significant costs and late results from the high latency networks that are traversed. The latter can give timely results but forces users to manually define which part of the computation should be executed on Edge and to interconnect it with the remaining part executed in the Cloud, leading to sub-optimal placements. In this paper, we introduce Planner, a middleware for uniform and transparent stream processing across Edge and Cloud. Planner automatically selects which parts of the execution graph will be executed at the Edge in order to minimize the network cost. Real-world micro-benchmarks show that Planner reduces the network usage by 40% and the makespan (end-to-end processing time) by 15% compared to state-of-the-art.
WRENCH: A Framework for Simulating Workflow Management Systems
H. Casanova, Suraj Pandey, James Oeth, Ryan Tanaka, F. Suter, Rafael Ferreira da Silva
DOI: 10.1109/WORKS.2018.00013

Abstract: Scientific workflows are used routinely in numerous scientific domains, and Workflow Management Systems (WMSs) have been developed to orchestrate and optimize workflow executions on distributed platforms. WMSs are complex software systems that interact with complex software infrastructures. Most WMS research and development activities rely on empirical experiments conducted with full-fledged software stacks on actual hardware platforms. Such experiments, however, are limited to the hardware and software infrastructures at hand and can be labor- and/or time-intensive. As a result, relying solely on real-world experiments impedes WMS research and development. An alternative is to conduct experiments in simulation. In this work we present WRENCH, a WMS simulation framework whose objectives are (i) accurate and scalable simulations and (ii) easy simulation software development. WRENCH achieves its first objective by building on the SimGrid framework. While SimGrid is recognized for the accuracy and scalability of its simulation models, it only provides low-level simulation abstractions, so large software development efforts are required when implementing simulators of complex systems. WRENCH thus achieves its second objective by providing high-level and directly reusable simulation abstractions on top of SimGrid. After describing and giving rationales for WRENCH's software architecture and APIs, we present a case study in which we apply WRENCH to simulate the Pegasus production WMS. We report on ease of implementation, simulation accuracy, and simulation scalability so as to determine to what extent WRENCH achieves its two objectives. We also draw both qualitative and quantitative comparisons with a previously proposed workflow simulator.
A Practical Roadmap for Provenance Capture and Data Analysis in Spark-Based Scientific Workflows
Thaylon Guedes, V. Silva, M. Mattoso, Marcos V. N. Bedo, Daniel de Oliveira
DOI: 10.1109/WORKS.2018.00009

Abstract: Whenever high-performance computing applications meet data-intensive scalable systems, an attractive approach is the use of Apache Spark for the management of scientific workflows. Spark provides several advantages, such as being widely supported and granting efficient in-memory data management for large-scale applications. However, Spark still lacks support for data tracking and workflow provenance. Additionally, Spark's memory management requires access to all data movements between the workflow activities. Therefore, running legacy programs on Spark is treated as a "black-box" activity, which prevents the capture and analysis of implicit data movements. Here, we present SAMbA, an Apache Spark extension for gathering prospective and retrospective provenance and domain data within distributed scientific workflows. Our approach relies on enveloping both the RDD structure and data contents at runtime so that (i) data consumed and produced within RDD enclosures are captured and registered by SAMbA in a structured way, and (ii) provenance data can be queried during and after the execution of scientific workflows. Following the W3C PROV representation, we model the roles of RDDs with respect to prospective and retrospective provenance data. Our solution provides mechanisms for the capture and storage of provenance data without jeopardizing Spark's performance. The provenance retrieval capabilities of our proposal are evaluated in a practical case study, in which data analytics are provided by several SAMbA parameterizations.
Reduction of Workflow Resource Consumption Using a Density-based Clustering Model
Qimin Zhang, Nathaniel Kremer-Herman, Benjamín Tovar, D. Thain
DOI: 10.1109/WORKS.2018.00006

Abstract: Researchers running a scientific workflow often ask for orders of magnitude too few or too many resources. If the resource requisition is too small, jobs may fail due to resource exhaustion; if it is too large, resources are wasted even though the jobs succeed. Ideally, the workflow would run with a near-optimal amount of resources, ensuring all jobs succeed while minimizing waste. We present a strategy for solving this resource allocation problem: (1) the resources consumed by each job are recorded by a resource-monitor tool; (2) a density-based clustering model is proposed for discovering clusters among the jobs; (3) a maximal resource requisition is calculated as the ideal allocation for each cluster. We ran experiments with a synthetic workflow of homogeneous tasks as well as the bioinformatics tools Lifemapper, SHRIMP, BWA, and BWA-GATK to capture the inherent nature of workflow resource consumption, the clustering allowed by the model, and its usefulness in real workflows. For Lifemapper, the smallest observed savings in time, cores, memory, and disk are 13.82%, 16.62%, 49.15%, and 93.89%, respectively. For SHRIMP, BWA, and BWA-GATK, the smallest observed savings in cores, memory, and disk are 50%, 90.14%, and 51.82%, respectively. Compared with a fixed resource allocation strategy, our approach provides a noticeable reduction of workflow resource consumption.
Dynamic Distributed Orchestration of Node-RED IoT Workflows Using a Vector Symbolic Architecture
Christopher Simpkin, I. Taylor, Daniel Harborne, G. Bent, A. Preece, Ragu K. Ganti
DOI: 10.1109/WORKS.2018.00011

Abstract: There are a large number of workflow systems designed to work in various scientific domains, including support for the Internet of Things (IoT). One such workflow system is Node-RED, which is designed to bring workflow-based programming to IoT. However, the majority of scientific workflow systems, and specifically systems like Node-RED, are designed to operate in a fixed networked environment and rely on a central point of coordination to manage the workflow. The main focus of the work described in this paper is to investigate means whereby we can migrate Node-RED workflows into a decentralized execution environment, so that such workflows can run on Edge networks, where nodes are extremely transient in nature. We demonstrate the feasibility of such an approach by showing how we can migrate a Node-RED-based traffic congestion detection workflow into a decentralized environment. The detection algorithm is implemented as a set of Web services within Node-RED, and we have architected and implemented a system that proxies the centralized Node-RED services using cognitively-aware wrapper services designed to operate in a decentralized environment. Our cognitive services use a Vector Symbolic Architecture (VSA) to semantically represent service descriptions and workflows in a way that can be unraveled on the fly without any central point of control. The VSA-based system is capable of parsing Node-RED workflows and migrating them to a decentralized environment for execution, providing a way to use Node-RED as a front-end graphical composition tool for decentralized workflows.
{"title":"LOS: Level Order Sampling for Task Graph Scheduling on Heterogeneous Resources","authors":"Carl Witt, Sam Wheating, U. Leser","doi":"10.1109/WORKS.2018.00008","DOIUrl":"https://doi.org/10.1109/WORKS.2018.00008","url":null,"abstract":"List scheduling is an approach to task graph scheduling that has been shown to work well for scheduling tasks with data dependencies on heterogeneous resources. Key to the performance of a list scheduling heuristic is its method to prioritize the tasks, and various ranking schemes have been proposed in the literature. We propose a method that combines multiple random rankings instead of a using a deterministic ranking scheme. We introduce L-Orders, which are a subset of all topological orders of a directed acyclic graph. L-Orders can be used to explore targeted regions of the space of all topological orders. Using the observation that the makespans in one such region are often approximately normal distributed, we estimate the expected time to solution improvement in certain regions of the search space. We combine targeted search and improvement time estimations into a time budgeted search algorithm that balances exploration and exploitation of the search space. In 40,500 experiments, our schedules are 5% shorter on average and up to 40% shorter in extreme cases than schedules produced by HEFT.","PeriodicalId":154317,"journal":{"name":"2018 IEEE/ACM Workflows in Support of Large-Scale Science (WORKS)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123539523","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Title Page","authors":"","doi":"10.1109/works.2018.00001","DOIUrl":"https://doi.org/10.1109/works.2018.00001","url":null,"abstract":"","PeriodicalId":154317,"journal":{"name":"2018 IEEE/ACM Workflows in Support of Large-Scale Science (WORKS)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130556869","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}