{"title":"Batch and online anomaly detection for scientific applications in a Kubernetes environment","authors":"S. Hariri, M. C. Kind","doi":"10.1145/3217880.3217883","DOIUrl":"https://doi.org/10.1145/3217880.3217883","url":null,"abstract":"We present a cloud-based anomaly detection service framework that uses a containerized Spark cluster and ancillary user interfaces, all managed by Kubernetes. This technology stack provides a fast, reliable, resilient, and easily scalable service for either batch or streaming data. At the heart of the service, we use Extended Isolation Forest, an improved version of the Isolation Forest algorithm, for robust and efficient anomaly detection. We showcase the design and a typical workflow of our infrastructure, which is ready to deploy on any Kubernetes cluster without extra technical knowledge. With exposed APIs and simple graphical interfaces, users can load any data and detect anomalies on the loaded set or on newly presented data points using a batch or a streaming mode. With the latter, users can subscribe and get notifications on the desired output. Our aim is to develop and apply these techniques to scientific data. 
In particular, we are interested in finding anomalous objects within the overwhelming set of images and catalogs produced by current and future astronomical surveys, but these techniques can be easily adapted to other fields.","PeriodicalId":340918,"journal":{"name":"Proceedings of the 9th Workshop on Scientific Cloud Computing","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114350984","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Early Experience Using Amazon Batch for Scientific Workflows","authors":"Kyle M. D. Sweeney, D. Thain","doi":"10.1145/3217880.3217885","DOIUrl":"https://doi.org/10.1145/3217880.3217885","url":null,"abstract":"Recent technological trends have pushed many products and technologies into the cloud, relying less on local computational services, and instead purchasing computation a la carte from cloud service providers. These providers focus more on delivering technologies which are service based rather than throughput based. With the advent of Amazon Batch, a new high throughput service, we wished to see how capable it was for running scientific workflows compared to existing cloud services. To that end, we developed a testing suite which created workflows focusing on increasing shared file sizes, increasing unique file sizes, and increasing number of tasks, and ran the workflows on Amazon Batch plus two other similar configurations for comparison: EC2 workers and Work Queue on EC2. We found that while there is a significant delay in sending jobs to Amazon Batch and running raw EC2 workers, there is little overhead in the actual running of the task, and similar performance to using Work Queue on EC2 when the workflow does not require large input files. 
Additionally, when running a real workflow, Batch achieved a speedup of 1.18x over Work Queue workers on EC2 instances.","PeriodicalId":340918,"journal":{"name":"Proceedings of the 9th Workshop on Scientific Cloud Computing","volume":"119 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132019661","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Libra","authors":"Illyoung Choi, A. Ponsero, K. Youens-Clark, Matthew Bomhoff, B. Hurwitz, J. Hartman","doi":"10.1145/3217880.3217882","DOIUrl":"https://doi.org/10.1145/3217880.3217882","url":null,"abstract":"Big-data analytics platforms, such as Hadoop, are appealing for scientific computation because they are ubiquitous, well-supported, and well-understood. Unfortunately, load-balancing is a common challenge of implementing large-scale scientific computing applications on these platforms. In this paper we present the design and implementation of Libra, a Hadoop-based tool for comparative metagenomics (comparing samples of genetic material collected from the environment). We describe the computation that Libra performs and how that computation is implemented using Hadoop tasks, including the techniques used by Libra to ensure that the task workloads are balanced despite nonuniform sample sizes and skewed distributions of genetic material in the samples. On a 10-machine Hadoop cluster Libra can analyze the entire Tara Ocean Viromes of ~4.2 billion reads in fewer than 20 hours.","PeriodicalId":340918,"journal":{"name":"Proceedings of the 9th Workshop on Scientific Cloud Computing","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125034325","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient Integration of Containers into Scientific Workflows","authors":"Kyle M. D. Sweeney, D. Thain","doi":"10.1145/3217880.3217887","DOIUrl":"https://doi.org/10.1145/3217880.3217887","url":null,"abstract":"Containers offer a powerful way to create portability for scientific applications. However, incorporating them into workflows requires careful consideration, as straightforward approaches can increase network usage and runtime. We identified three issues in this process: container composition, containerizing workers or jobs, and container image translation. To tackle composition, we classify data into three types: OS data, Read-Only data, and Working data, and define dynamic and static composition. Using static composition (creating a single container for each job) leads to massive waste in sending duplicate data over the network. Dynamic composition (sending the data types separately) enables caching on worker nodes. To decide whether to run workers or jobs inside a container, we looked at the costs of running inside of a container. Finally, when using different types of container technologies simultaneously, we found it is better to convert to the target image types before sending the container images, rather than repeating the same conversion at the job nodes, which wastes more time.","PeriodicalId":340918,"journal":{"name":"Proceedings of the 9th Workshop on Scientific Cloud Computing","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132626210","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Faodel","authors":"C. Ulmer, Shyamali Mukherjee, G. Templet, Scott Levy, J. Lofstead, Patrick M. Widener, T. Kordenbrock, Margaret Lawson","doi":"10.1145/3217880.3217888","DOIUrl":"https://doi.org/10.1145/3217880.3217888","url":null,"abstract":"Composition of computational science applications, whether into ad hoc pipelines for analysis of simulation data or into well-defined and repeatable workflows, is becoming commonplace. In order to scale well as projected system and data sizes increase, developers will have to address a number of looming challenges. Increased contention for parallel filesystem bandwidth, accommodating in situ and ex situ processing, and the advent of decentralized programming models will all complicate application composition for next-generation systems. In this paper, we introduce a set of data services, Faodel, which provide scalable data management for workflows and composed applications. Faodel allows workflow components to directly and efficiently exchange data in semantically appropriate forms, rather than those dictated by the storage hierarchy or programming model in use. We describe the architecture of Faodel and present preliminary performance results demonstrating its potential for scalability in workflow scenarios.","PeriodicalId":340918,"journal":{"name":"Proceedings of the 9th Workshop on Scientific Cloud Computing","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121238380","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"High Availability on Jetstream: Practices and Lessons Learned","authors":"John Michael Lowe, Jeremy Fischer, Sanjana Sudarshan, George W. Turner, C. Stewart, David Y. Hancock","doi":"10.1145/3217880.3217884","DOIUrl":"https://doi.org/10.1145/3217880.3217884","url":null,"abstract":"Research computing has traditionally used high performance computing (HPC) clusters and has been a service not given to high availability without a doubling of computational and storage capacity. System maintenance such as security patching, firmware updates, and other system upgrades generally meant that the system would be unavailable for the duration of the work unless one has redundant HPC systems and storage. While efforts were often made to limit downtimes, when it became necessary, maintenance windows might be one to two hours or as much as an entire day. As the National Science Foundation (NSF) began funding non-traditional research systems, looking at ways to provide higher availability for researchers became one focus for service providers. One of the design elements of Jetstream was to have geographic dispersion to maximize availability. This was the first step in a number of design elements intended to make Jetstream exceed the NSF's availability requirements. 
We will examine the design steps employed, the components of the system and how the availability for each was considered in deployment, how maintenance is handled, and the lessons learned from the design and implementation of the Jetstream cloud.","PeriodicalId":340918,"journal":{"name":"Proceedings of the 9th Workshop on Scientific Cloud Computing","volume":"223 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120867985","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Predicting Amazon Spot Prices with LSTM Networks","authors":"Matt Baughman, C. Haas, R. Wolski, Ian T Foster, K. Chard","doi":"10.1145/3217880.3217881","DOIUrl":"https://doi.org/10.1145/3217880.3217881","url":null,"abstract":"Amazon spot instances provide preemptable computing capacity at a cost that is often significantly lower than comparable on-demand or reserved instances. Spot instances are charged at the current spot price: a fluctuating market price based on supply and demand for spot instance capacity. However, spot instances are inherently volatile, the spot price changes over time, and instances can be revoked by Amazon with as little as two minutes' warning. Given the potential discount---up to 90% in some cases---there has been significant interest in the scientific cloud computing community to leverage spot instances for workloads that are either fault-tolerant or not time-sensitive. However, cost-effective use of spot instances requires accurate prediction of spot prices in the future. We explore here the use of long/short-term memory (LSTM) recurrent neural networks for spot price prediction. We describe our model and compare it against a baseline ARIMA model using historical spot pricing data. Our results show that our LSTM approach can reduce training error by as much as 95%.","PeriodicalId":340918,"journal":{"name":"Proceedings of the 9th Workshop on Scientific Cloud Computing","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134032228","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Partitioning SKA Dataflows for Optimal Graph Execution","authors":"Chen Wu, A. Wicenec, R. Tobar","doi":"10.1145/3217880.3217886","DOIUrl":"https://doi.org/10.1145/3217880.3217886","url":null,"abstract":"Optimizing data-intensive workflow execution is essential to many modern scientific projects such as the Square Kilometre Array (SKA), which will be the largest radio telescope in the world, collecting terabytes of data per second for the next few decades. At the core of the SKA Science Data Processor is the graph execution engine, scheduling tens of thousands of algorithmic components to ingest and transform millions of parallel data chunks in order to solve a series of large-scale inverse problems within the power budget. To tackle this challenge, we have developed the Data Activated Liu Graph Engine (DALiuGE) to manage data processing pipelines for several SKA pathfinder projects. In this paper, we discuss the DALiuGE graph scheduling subsystem. By extending previous studies on graph scheduling and partitioning, we lay the foundation on which we can develop polynomial time optimization methods that minimize both workflow execution time and resource footprint while satisfying resource constraints imposed by individual algorithms. We show preliminary results obtained from three radio astronomy data pipelines.","PeriodicalId":340918,"journal":{"name":"Proceedings of the 9th Workshop on Scientific Cloud Computing","volume":"437 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123579794","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Proceedings of the 9th Workshop on Scientific Cloud Computing","authors":"Yogesh L. Simmhan, Gabriel Antoniu, C. Goble, L. Ramakrishnan","doi":"10.1145/3217880","DOIUrl":"https://doi.org/10.1145/3217880","url":null,"abstract":"It is our pleasure to welcome you to the 6th Workshop on Scientific Cloud Computing (ScienceCloud). ScienceCloud continues to provide the scientific community with the premier forum for discussing new research, development, and deployment efforts in hosting scientific computing workloads on cloud computing infrastructures. The focus of the workshop is on the use of cloud-based technologies to meet new compute-intensive and data-intensive scientific challenges that are not well served by the current supercomputers, grids and HPC clusters. ScienceCloud provides a unique opportunity for interaction and cross-pollination between researchers and practitioners developing applications, algorithms, software, hardware and networking, emphasizing scientific computing for such cloud platforms.\n\nThe call for papers attracted submissions from across the world. The program committee reviewed and accepted three of six full paper submissions (50%) and three of four short paper submissions (75%).\n\nWe are delighted to include a keynote and panel involving leading scientific cloud computing researchers. 
We encourage attendees to attend these presentations:\nChallenges of Running Scientific Workflows in Cloud Environments, Ewa Deelman (Information Sciences Institute, University of Southern California)\nReal-time Scientific Data Stream Processing, Manish Parashar (Rutgers, the State University of New Jersey), Doug Thain (University of Notre Dame), Ioan Raicu (Illinois Institute of Technology), Rui Zhang (IBM Research)","PeriodicalId":340918,"journal":{"name":"Proceedings of the 9th Workshop on Scientific Cloud Computing","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-06-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133169915","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}