Towards preserving results confidentiality in cloud-based scientific workflows
Isabel Rosseti, Kary A. C. S. Ocaña, Daniel de Oliveira
DOI: 10.1145/3150994.3151002
Abstract: Cloud computing has established itself as a solid computational model that allows scientists to deploy their simulation-based experiments on distributed virtual resources and to execute a wide range of scientific experiments. These experiments can be modeled as scientific workflows. Many of these workflows are data-intensive and produce large volumes of data, which are also stored in the cloud by Scientific Workflow Management Systems (SWfMS) using storage services. One main issue regarding cloud storage services is the confidentiality of stored data: if unauthorized people access data files, they can infer knowledge about the results or even about the workflow structure. Encryption is a possible solution, but it may not be sufficient, and a further level of security can be added to preserve data confidentiality: data dispersion. To reduce this risk, generated data files should not all be stored in the same bucket; at the very least, sensitive data files have to be distributed across several cloud storage services. In this paper, we present IPConf, an approach to preserve the confidentiality of workflow results in cloud storage. IPConf generates a distribution plan for the data files produced during a workflow execution. This plan disperses the data files across several cloud storage services to preserve confidentiality, and it is then sent to the SWfMS, which stores the generated data in the specified buckets during workflow execution. Experiments performed using real data from SciPhy workflow executions indicate the potential of the proposed approach.

{"title":"A compiler transformation-based approach to scientific workflow enactment","authors":"Matthias Janetschek, R. Prodan","doi":"10.1145/3150994.3150999","DOIUrl":"https://doi.org/10.1145/3150994.3150999","url":null,"abstract":"We investigate in this paper the application of compiler transformations to workflow applications using the Manycore Workflow Runtime Environment (MWRE), a compiler-based workflow environment for modern manycore computing architectures. MWRE translates scientific workflows into equivalent C++ programs and efficiently executes them using a novel callback mechanism for dependency resolution and data transfers, with explicit support for full-ahead scheduling. We evaluate four different classes of compiler transformations, analyse their advantages and possible solutions to overcome their limitations, and present experimental results for improving the performance of a combination of real-world and synthetic workflows through compiler transformations. Our experiments were able to improve the workflow enactment by a factor of two and to reduce the memory usage of the engine by up to 33%. We achieved a speedup of up to 1.7 by eliminating unnecessary activity invocations, an improved parallel throughput up to 2.8 times by transforming the workflow structure, and a better performance of the HEFT scheduling algorithm by up to 36%.","PeriodicalId":228111,"journal":{"name":"Proceedings of the 12th Workshop on Workflows in Support of Large-Scale Science","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131421744","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
E-HPC: a library for elastic resource management in HPC environments
William Fox, D. Ghoshal, Abel Souza, G. Rodrigo, L. Ramakrishnan
DOI: 10.1145/3150994.3150996
Abstract: Next-generation data-intensive scientific workflows need to support streaming and real-time applications with dynamic resource needs on high performance computing (HPC) platforms. The static resource allocation model on current HPC systems, designed for monolithic MPI applications, is insufficient to support the elastic resource needs of current and future workflows. In this paper, we discuss the design, implementation and evaluation of Elastic-HPC (E-HPC), an elastic framework for managing resources for scientific workflows on current HPC systems. E-HPC treats a workflow's resource slot as an elastic window that may map to different physical resources over the duration of the workflow, and uses checkpoint-restart as the underlying mechanism to migrate workflow execution across this dynamic window of resources. E-HPC provides the foundation necessary to enable dynamic allocation of the HPC resources needed by streaming and real-time workflows, with negligible overhead beyond the cost of checkpointing. Additionally, E-HPC decreases the turnaround time of workflows compared to the traditional model of resource allocation, where resources are allocated per stage of the workflow. Our evaluation shows that E-HPC improves core-hour utilization for common workflow resource-use patterns and provides an effective framework for elastically expanding the resources of applications with dynamic resource needs.

{"title":"A machine learning approach for modular workflow performance prediction","authors":"Alok Singh, A. Rao, Shweta Purawat, I. Altintas","doi":"10.1145/3150994.3150998","DOIUrl":"https://doi.org/10.1145/3150994.3150998","url":null,"abstract":"Scientific workflows provide an opportunity for declarative computational experiment design in an intuitive and efficient way. A distributed workflow is typically executed on a variety of resources and it uses a variety of computational algorithms or tools to achieve the desired outcomes. Such a variety imposes additional complexity in scheduling these workflows on large scale computers. As computation becomes more distributed, insights into expected workload that a workflow presents become critical for effective resource allocation. In this paper, we present a modular framework that leverages Machine Learning for creating precise performance predictions of a workflow. The central idea is to partition a workflow in such a way that makes the task of forecasting each atomic unit manageable and gives us a way to combine the individual predictions efficiently. We recognize a combination of an executable and a specific physical resource as a single module. This gives us a handle to characterize workload and machine power as a single unit of prediction. The modular approach of the presented framework allows it to adapt to highly complex nested workflows and scale to new scenarios. We present performance estimation results of independent workflow modules executed on the XSEDE SDSC Comet cluster using various Machine Learning algorithms. The results provide insights into the behavior and effectiveness of different algorithms in the context of scientific workflow performance prediction.","PeriodicalId":228111,"journal":{"name":"Proceedings of the 12th Workshop on Workflows in Support of Large-Scale Science","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127164068","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
rvGAHP: push-based job submission using reverse SSH connections
S. Callaghan, G. Juve, K. Vahi, P. Maechling, T. Jordan, E. Deelman
DOI: 10.1145/3150994.3151003
Abstract: Computational science researchers running large-scale scientific workflow applications often want to run their workflows on the largest available compute systems to improve time to solution. Workflow tools used in distributed, heterogeneous, high performance computing environments typically rely on either a push-based or a pull-based approach for resource provisioning from these compute systems. However, many large clusters have moved to two-factor authentication for job submission, making traditional automated push-based job submission impossible. On the other hand, pull-based approaches such as pilot jobs may lead to increased complexity and a reduction in node-hour efficiency. In this paper, we describe a new, efficient approach based on HTCondor-G called reverse GAHP (rvGAHP) that allows us to push jobs using reverse SSH submissions with better efficiency than pull-based methods. We successfully used this approach to perform a large probabilistic seismic hazard analysis study using SCEC's CyberShake workflow in March 2017 on the Titan Cray XK7 hybrid system at Oak Ridge National Laboratory.

On the use of burst buffers for accelerating data-intensive scientific workflows
Rafael Ferreira da Silva, S. Callaghan, E. Deelman
DOI: 10.1145/3150994.3151000
Abstract: Science applications frequently produce and consume large volumes of data, but delivering this data to and from compute resources can be challenging, as parallel file system performance is not keeping up with compute and memory performance. To mitigate this I/O bottleneck, some systems have deployed burst buffers, but their impact on performance for real-world workflow applications is not always clear. In this paper, we examine the impact of burst buffers through the remote-shared, allocatable burst buffers on the Cori system at NERSC. By running a subset of the SCEC CyberShake workflow, a production seismic hazard analysis workflow, we find that using burst buffers offers read and write improvements of about an order of magnitude, and these improvements lead to increased job performance, even for long-running CPU-bound jobs.

Processing of crowd-sourced data from an internet of floating things
R. Montella, D. Luccio, L. Marcellino, A. Galletti, Sokol Kosta, A. Brizius, Ian T Foster
DOI: 10.1145/3150994.3150997
Abstract: Sensors incorporated into mobile devices provide unique opportunities to capture detailed environmental information that cannot be readily collected in other ways. We show here how data from networked navigational sensors on leisure vessels can be used to construct unique new datasets, using the example of underwater topography (bathymetry) to demonstrate the approach. Specifically, we describe an end-to-end workflow that involves the collection of large numbers of timestamped (position, depth) measurements from "internet of floating things" devices on leisure vessels; the communication of data to cloud resources, via a specialized protocol capable of dealing with delayed, intermittent, or even disconnected networks; the integration of measurement data into cloud storage; the efficient correction and interpolation of measurements on a cloud computing platform; and the creation of a continuously updated bathymetric database. Our prototype implementation of this workflow leverages the FACE-IT Galaxy workflow engine to integrate network communication and database components with a CUDA-enabled algorithm running in a virtualized cloud environment.

{"title":"Supporting task-level fault-tolerance in HPC workflows by launching MPI jobs inside MPI jobs","authors":"Matthieu Dorier, J. Wozniak, R. Ross","doi":"10.1145/3150994.3151001","DOIUrl":"https://doi.org/10.1145/3150994.3151001","url":null,"abstract":"While the use of workflows for HPC is growing, MPI interoperability remains a challenge for workflow management systems. The MPI standard and/or its implementations provide a number of ways to build multiple-programs-multiple-data (MPMD) applications. These methods present limitations related to fault tolerance, and are not easy to use. In this paper, we advocate for a novel MPI_Comm_launch function acting as the parallel counterpart of a system(3) call. MPI_Comm_launch allows a child MPI application to be launched inside the resources originally held by processes of a parent MPI application. Two important aspects of MPI_Comm_launch is that it pauses the calling process, and runs the child processes on the parent's CPU cores, but in an isolated manner with respect to memory. This function makes it easier to build MPMD applications with well-decoupled subtasks. We show how this feature can provide better flexibility and better fault tolerance in ensemble simulations and HPC workflows. We report results showing 2x throughput improvement for application workflows with faults, and scaling results for challenging workloads up to 256 nodes.","PeriodicalId":228111,"journal":{"name":"Proceedings of the 12th Workshop on Workflows in Support of Large-Scale Science","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123216517","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Proceedings of the 12th Workshop on Workflows in Support of Large-Scale Science
S. Gesing, M. Atkinson, I. Klampanos, Michelle Galea, M. Berthold, R. Barbera, Diego Scardaci, G. Terstyánszky, T. Kiss, P. Kacsuk
Abstract: Scientific workflows are routinely used in most scientific disciplines today, as they provide a systematic way to execute a number of applications in science and engineering. They sit at the interface between end-users and computing infrastructures, often relying on workflow management systems and a variety of parallel and/or distributed computing resources. In addition, with the drastic increase in raw data volumes in many domains, they play an important role in helping scientists organize and process their data and leverage High-Performance or High-Throughput Computing resources.