Kary Ann del Carmen Ocaña Gautherot, Marcelo Galheigo, Carla Osthoff, Luiz M. R. Gadelha, A. A. Gomes, Daniel de Oliveira, F. Porto, A. T. Vasconcelos
{"title":"Towards a Science Gateway for Bioinformatics: Experiences in the Brazilian System of High Performance Computing","authors":"Kary Ann del Carmen Ocaña Gautherot, Marcelo Galheigo, Carla Osthoff, Luiz M. R. Gadelha, A. A. Gomes, Daniel de Oliveira, F. Porto, A. T. Vasconcelos","doi":"10.1109/CCGRID.2019.00082","DOIUrl":"https://doi.org/10.1109/CCGRID.2019.00082","url":null,"abstract":"Science gateways bring out the possibility of reproducible science as they are integrated into reusable techniques, data and workflow management systems, security mechanisms, and high performance computing (HPC). We introduce BioinfoPortal, a science gateway that integrates a suite of different bioinformatics applications using HPC and data management resources provided by the Brazilian National HPC System (SINAPAD). BioinfoPortal follows the Software as a Service (SaaS) model and the web server is freely available for academic use. The goal of this paper is to describe the science gateway and its usage, addressing challenges of designing a multiuser computational platform for parallel/distributed executions of large-scale bioinformatics applications using the Brazilian HPC resources. 
We also present a study of the performance and scalability of some bioinformatics applications executed in the HPC environments, and perform machine learning analyses to predict HPC allocation/usage features that could improve the execution of bioinformatics applications via BioinfoPortal.","PeriodicalId":234571,"journal":{"name":"2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122453641","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Christoph Jansen, Bruno Schilling, K. Strohmenger, Michael Witt, Jonas Annuscheit, D. Krefting
{"title":"Reproducibility and Performance of Deep Learning Applications for Cancer Detection in Pathological Images","authors":"Christoph Jansen, Bruno Schilling, K. Strohmenger, Michael Witt, Jonas Annuscheit, D. Krefting","doi":"10.1109/CCGRID.2019.00080","DOIUrl":"https://doi.org/10.1109/CCGRID.2019.00080","url":null,"abstract":"Convolutional Neural Networks (CNN) are used for automatic cancer detection in pathological images. These data-driven experiments are difficult to reproduce, because the CNNs may require CUDA-enabled Nvidia GPUs for acceleration and training is often performed on a large dataset stored on a researcher's computer, inaccessible to others. We introduce the RED file format for reproducible experiment description, where executable programs are packaged and referenced as Docker container images. Data inputs and outputs are described as network resources using standard transmission and authentication protocols instead of local file paths. Following the FAIR guiding principles, the RED format is based on and compatible with the established Common Workflow Language specification. RED files are interpreted by the accompanying Curious Containers (CC) software. Arbitrarily large datasets are mounted inside containers via FUSE network filesystems like SSHFS. SSHFS is compared to NFS and a local SSD in artificial benchmarks and in the context of a CNN training scenario, where SSHFS introduces a performance decrease by a factor of 1.8. We are convinced that RED can greatly improve the reproducibility of deep learning workloads and data-driven experiments. 
This is particularly important in clinical scenarios where the result of an analysis may contribute to a patient's treatment.","PeriodicalId":234571,"journal":{"name":"2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)","volume":"114 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117205290","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Matthew G. F. Dosanjh, W. Schonbein, Ryan E. Grant, Patrick G. Bridges, S. Mahdieh Gazimirsaeed, A. Afsahi
{"title":"Fuzzy Matching: Hardware Accelerated MPI Communication Middleware","authors":"Matthew G. F. Dosanjh, W. Schonbein, Ryan E. Grant, Patrick G. Bridges, S. Mahdieh Gazimirsaeed, A. Afsahi","doi":"10.1109/CCGRID.2019.00035","DOIUrl":"https://doi.org/10.1109/CCGRID.2019.00035","url":null,"abstract":"Contemporary parallel scientific codes often rely on message passing for inter-process communication. However, inefficient coding practices or multithreading (e.g., via MPI_THREAD_MULTIPLE) can severely stress the underlying message processing infrastructure, resulting in potentially unacceptable impacts on application performance. In this article, we propose and evaluate a novel method for addressing this issue: 'Fuzzy Matching'. This approach has two components. First, it exploits the fact that most server-class CPUs include vector operations to parallelize message matching. Second, based on a survey of point-to-point communication patterns in representative scientific applications, the method further increases parallelization by allowing matches based on 'partial truth', i.e., by identifying probable rather than exact matches. We evaluate the impact of this approach on memory usage and performance on Knights Landing and Skylake processors. 
At scale (262,144 Intel Xeon Phi cores), the method shows up to 1.13 GiB of memory savings per node in the MPI library, and improvement in matching time of 95.9%; smaller-scale runs show run-time improvements of up to 31.0% for full applications, and up to 6.1% for optimized proxy applications.","PeriodicalId":234571,"journal":{"name":"2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)","volume":"55 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133057760","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
E. G. Renart, A. Veith, Daniel Balouek-Thomert, M. Assunção, L. Lefèvre, M. Parashar
{"title":"Distributed Operator Placement for IoT Data Analytics Across Edge and Cloud Resources","authors":"E. G. Renart, A. Veith, Daniel Balouek-Thomert, M. Assunção, L. Lefèvre, M. Parashar","doi":"10.1109/CCGRID.2019.00060","DOIUrl":"https://doi.org/10.1109/CCGRID.2019.00060","url":null,"abstract":"The number of Internet of Things applications is forecast to grow exponentially within the coming decade. Owners of such applications strive to make predictions from large streams of complex input in near real time. Cloud-based architectures often centralize storage and processing, generating high data movement overheads that penalize real-time applications. Edge and Cloud architecture pushes computation closer to where the data is generated, reducing the cost of data movements and improving the application response time. The heterogeneity among the edge devices and cloud servers introduces an important challenge for deciding how to split and orchestrate the IoT applications across the edge and the cloud. In this paper, we extend our IoT Edge Framework, called R-Pulsar, to propose a solution on how to split IoT applications dynamically across the edge and the cloud, allowing us to improve performance metrics such as end-to-end latency (response time), bandwidth consumption, and edge-to-cloud and cloud-to-edge messaging cost. Our approach consists of a programming model and real-world implementation of an IoT application. The results show that our approach can minimize the end-to-end latency by at least 38% by pushing part of the IoT application to the edge. 
Meanwhile, the edge-to-cloud data transfers are reduced by at least 38% and the messaging costs are reduced by at least 50% when using the existing commercial edge cloud cost models.","PeriodicalId":234571,"journal":{"name":"2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)","volume":"429 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126090659","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Anomaly Detection and Classification using Distributed Tracing and Deep Learning","authors":"S. Nedelkoski, Jorge Cardoso, O. Kao","doi":"10.1109/CCGRID.2019.00038","DOIUrl":"https://doi.org/10.1109/CCGRID.2019.00038","url":null,"abstract":"Artificial Intelligence for IT Operations (AIOps) combines big data and machine learning to replace a broad range of IT Operations tasks including availability, performance, and monitoring of services. By exploiting log, tracing, metric, and network data, AIOps enables the detection of faults and issues in services. The focus of this work is on detecting anomalies based on distributed tracing records that contain detailed information about the availability and the response time of the services. In large-scale distributed systems, where a service is deployed on heterogeneous hardware and has multiple scenarios of normal operation, it becomes challenging to detect such anomalous cases. We address the problem by proposing an unsupervised response time anomaly detection method based on deep learning data modeling techniques; an unsupervised dynamic error threshold approach; a tolerance module for false positive reduction; and a descriptive classification of the anomalies. 
The evaluation shows that the approach achieves high accuracy and solid performance in both an experimental testbed and a large-scale production cloud.","PeriodicalId":234571,"journal":{"name":"2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129735180","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Anas Toma, Juri Wenner, J. E. Lenssen, Jian-Jia Chen
{"title":"Adaptive Quality Optimization of Computer Vision Tasks in Resource-Constrained Devices using Edge Computing","authors":"Anas Toma, Juri Wenner, J. E. Lenssen, Jian-Jia Chen","doi":"10.1109/CCGRID.2019.00061","DOIUrl":"https://doi.org/10.1109/CCGRID.2019.00061","url":null,"abstract":"This paper presents an approach to optimize the quality of computer vision tasks in resource-constrained devices by using different execution versions of the same task. The execution versions are generated by dropping irrelevant contents of the input images or other contents that have marginal effect on the quality of the result. Our execution model is designed to support the edge computing paradigm, where the tasks can be executed remotely on edge nodes either to improve the quality or to reduce the workload of the local device. We also propose an algorithm that selects the suitable execution versions, which includes selecting the configuration and the location of the execution, in order to maximize the total quality of the tasks based on the available resources. The proposed approach provides reliable and adaptive task execution by using several execution versions with various performance and quality trade-offs. Therefore, it is very beneficial for systems with resource and timing constraints such as portable medical devices, surveillance video cameras, wearable systems, etc. 
The proposed algorithm is evaluated using different computer vision benchmarks.","PeriodicalId":234571,"journal":{"name":"2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114076309","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Welcome from the Program Chairs","authors":"Carlos Castillo, Donald Metzler","doi":"10.1109/ccgrid.2019.00006","DOIUrl":"https://doi.org/10.1109/ccgrid.2019.00006","url":null,"abstract":"As many of you who have previously visited Portland know, this city is a terrific location for our meeting: set between Mt. Hood and the Pacific coast, Portland (PDX to locals) presents a variety of attractions for everybody. Infused with the awesome Pacific Northwest culture of environmental conservation and organic local food sourcing, you will find a lot of wonderful restaurants here, plus a myriad of microbreweries, distilleries and wineries! Don’t forget to visit famous PDX staples like Powell’s Books and the charismatic Voodoo Donuts. Don’t forget your hiking shoes or beachwear if you’re planning to explore the great outdoors like the Columbia Gorge area or the Oregon Coast beaches.","PeriodicalId":234571,"journal":{"name":"2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124061007","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimized Memory Management for a Java-Based Distributed In-memory System","authors":"Stefan Nothaas, Kevin Beineke, M. Schöttner","doi":"10.1109/CCGRID.2019.00086","DOIUrl":"https://doi.org/10.1109/CCGRID.2019.00086","url":null,"abstract":"Several Java-based distributed in-memory systems have been proposed in the literature, but most do not target graph applications with highly concurrent and irregular access patterns to many small data objects. DXRAM addresses these challenges and relies on DXMem for memory management and concurrency control on each server. DXMem is published as an open-source library that can also be used by any other system. In this paper, we briefly describe the previously published but relevant design aspects of the memory management. The main contributions of this paper, however, are the new extensions, optimizations, and evaluations. These contributions include an improved address translation, which is now faster than the old solution with a translation cache. The coarse-grained concurrency control of our first approach has been replaced by a very efficient per-object read-write lock, which allows a much better throughput, especially under high concurrency. Finally, we compared DXRAM for the first time to Hazelcast and Infinispan, two state-of-the-art Java-based distributed cache systems, using real-world application workloads and the Yahoo! Cloud Serving Benchmark in a distributed environment. 
The results of the experiments show that DXRAM outperforms both systems while having a much lower metadata overhead for many small data objects.","PeriodicalId":234571,"journal":{"name":"2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128731742","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Romolo Marotta, Mauro Ianni, Andrea Scarselli, Alessandro Pellegrini, F. Quaglia
{"title":"NBBS: A Non-Blocking Buddy System for Multi-core Machines","authors":"Romolo Marotta, Mauro Ianni, Andrea Scarselli, Alessandro Pellegrini, F. Quaglia","doi":"10.1109/CCGRID.2019.00011","DOIUrl":"https://doi.org/10.1109/CCGRID.2019.00011","url":null,"abstract":"Common implementations of core memory allocation components, like the Linux buddy system, handle concurrent allocation/release requests by synchronizing threads via spin-locks. This approach does not scale well, a problem that has been addressed in the literature by introducing layered allocation services or by replicating the core allocators, the bottommost ones within the layered architecture. Both these solutions tend to reduce the pressure of actual concurrent accesses to each individual core allocator. In this article we explore an alternative approach to the scalability of memory allocation/release, which can still be combined with those literature proposals. We present a fully non-blocking buddy system, where threads performing concurrent allocations/releases do not undergo any spin-lock based synchronization. Our solution allows threads to proceed in parallel and commit their allocations/releases unless a conflict materializes while handling the allocator metadata. Conflict detection relies on atomic Read-Modify-Write (RMW) machine instructions. 
Beyond improving scalability and performance, our solution can also avoid wasting clock cycles for spin-lock operations by threads that could in principle carry out their memory allocations/releases in full concurrency.","PeriodicalId":234571,"journal":{"name":"2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)","volume":"119 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133240174","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}