2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID): Latest Papers, Page 8

Towards a Science Gateway for Bioinformatics: Experiences in the Brazilian System of High Performance Computing

2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID) Pub Date : 2019-05-01 DOI: 10.1109/CCGRID.2019.00082

Kary Ann del Carmen Ocaña Gautherot, Marcelo Galheigo, Carla Osthoff, Luiz M. R. Gadelha, A. A. Gomes, Daniel de Oliveira, F. Porto, A. T. Vasconcelos

Abstract: Science gateways open up the possibility of reproducible science because they integrate reusable techniques, data and workflow management systems, security mechanisms, and high performance computing (HPC). We introduce BioinfoPortal, a science gateway that integrates a suite of bioinformatics applications using HPC and data management resources provided by the Brazilian National HPC System (SINAPAD). BioinfoPortal follows the Software as a Service (SaaS) model, and the web server is freely available for academic use. The goal of this paper is to describe the science gateway and its usage, addressing the challenges of designing a multiuser computational platform for parallel and distributed execution of large-scale bioinformatics applications on Brazilian HPC resources. We also present a study of the performance and scalability of some bioinformatics applications executed in the HPC environments, and use machine learning analyses to predict HPC allocation and usage features that could improve the execution of bioinformatics applications via BioinfoPortal.

Citations: 4

Reproducibility and Performance of Deep Learning Applications for Cancer Detection in Pathological Images

2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID) Pub Date : 2019-05-01 DOI: 10.1109/CCGRID.2019.00080

Christoph Jansen, Bruno Schilling, K. Strohmenger, Michael Witt, Jonas Annuscheit, D. Krefting

Abstract: Convolutional Neural Networks (CNNs) are used for automatic cancer detection in pathological images. These data-driven experiments are difficult to reproduce, because the CNNs may require CUDA-enabled Nvidia GPUs for acceleration, and training is often performed on a large dataset stored on a researcher's computer, inaccessible to others. We introduce the RED file format for reproducible experiment description, where executable programs are packaged and referenced as Docker container images. Data inputs and outputs are described as network resources using standard transmission and authentication protocols instead of local file paths. Following the FAIR guiding principles, the RED format is based on and compatible with the established Common Workflow Language specification. RED files are interpreted by the accompanying Curious Containers (CC) software. Arbitrarily large datasets are mounted inside containers via FUSE network filesystems like SSHFS. SSHFS is compared to NFS and a local SSD in artificial benchmarks and in a CNN training scenario, where SSHFS reduces performance by a factor of 1.8. We are convinced that RED can greatly improve the reproducibility of deep learning workloads and data-driven experiments. This is particularly important in clinical scenarios where the result of an analysis may contribute to a patient's treatment.

Citations: 3

Fuzzy Matching: Hardware Accelerated MPI Communication Middleware

2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID) Pub Date : 2019-05-01 DOI: 10.1109/CCGRID.2019.00035

Matthew G. F. Dosanjh, W. Schonbein, Ryan E. Grant, Patrick G. Bridges, S. Mahdieh Gazimirsaeed, A. Afsahi

Abstract: Contemporary parallel scientific codes often rely on message passing for inter-process communication. However, inefficient coding practices or multithreading (e.g., via MPI_THREAD_MULTIPLE) can severely stress the underlying message processing infrastructure, resulting in potentially unacceptable impacts on application performance. In this article, we propose and evaluate a novel method for addressing this issue: 'Fuzzy Matching'. This approach has two components. First, it exploits the fact that most server-class CPUs include vector operations to parallelize message matching. Second, based on a survey of point-to-point communication patterns in representative scientific applications, the method further increases parallelization by allowing matches based on 'partial truth', i.e., by identifying probable rather than exact matches. We evaluate the impact of this approach on memory usage and performance on Knights Landing and Skylake processors. At scale (262,144 Intel Xeon Phi cores), the method shows up to 1.13 GiB of memory savings per node in the MPI library and a 95.9% improvement in matching time; smaller-scale runs show run-time improvements of up to 31.0% for full applications and up to 6.1% for optimized proxy applications.
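
The two-stage idea in this abstract (find probable matches cheaply, then confirm them exactly) can be illustrated with a small pure-Python sketch. This is not the authors' implementation: the 8-bit `digest` function, the `Envelope` type, and the `ANY` constant are hypothetical names chosen for illustration, and the per-entry loop stands in for what a vector instruction would compare in bulk.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional

ANY = -1  # hypothetical stand-in for the MPI_ANY_SOURCE / MPI_ANY_TAG wildcards


@dataclass(frozen=True)
class Envelope:
    source: int
    tag: int
    comm: int


def digest(source: int, tag: int, comm: int) -> int:
    """Hypothetical 8-bit summary of an envelope, cheap to compare in bulk."""
    return (source * 31 + tag * 17 + comm) & 0xFF


def find_match(posted: list[Envelope], msg: Envelope) -> Optional[int]:
    """Return the index of the first posted receive that matches msg."""
    want = digest(msg.source, msg.tag, msg.comm)
    for i, recv in enumerate(posted):
        wildcard = recv.source == ANY or recv.tag == ANY
        # Stage 1: probable match. A wildcard receive cannot be summarised
        # by a digest, so it always proceeds to the exact check.
        if not wildcard and digest(recv.source, recv.tag, recv.comm) != want:
            continue
        # Stage 2: exact verification rules out digest collisions.
        if (recv.comm == msg.comm
                and recv.source in (ANY, msg.source)
                and recv.tag in (ANY, msg.tag)):
            return i
    return None
```

Scanning the posted receives in order and returning the first hit preserves MPI's rule that the earliest matching receive wins; the exact check in stage 2 is what keeps a probable match from becoming a wrong one.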

Citations: 11

Distributed Operator Placement for IoT Data Analytics Across Edge and Cloud Resources

2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID) Pub Date : 2019-05-01 DOI: 10.1109/CCGRID.2019.00060

E. G. Renart, A. Veith, Daniel Balouek-Thomert, M. Assunção, L. Lefèvre, M. Parashar

Abstract: The number of Internet of Things (IoT) applications is forecast to grow exponentially within the coming decade. Owners of such applications strive to make predictions from large streams of complex input in near real time. Cloud-based architectures often centralize storage and processing, generating high data movement overheads that penalize real-time applications. Edge and cloud architectures push computation closer to where the data is generated, reducing the cost of data movement and improving application response time. The heterogeneity among edge devices and cloud servers makes it challenging to decide how to split and orchestrate IoT applications across the edge and the cloud. In this paper, we extend our IoT edge framework, called R-Pulsar, to propose a solution for splitting IoT applications dynamically across the edge and the cloud, allowing us to improve performance metrics such as end-to-end latency (response time), bandwidth consumption, and edge-to-cloud and cloud-to-edge messaging cost. Our approach consists of a programming model and a real-world implementation of an IoT application. The results show that our approach can reduce the end-to-end latency by at least 38% by pushing part of the IoT application to the edge. Meanwhile, the edge-to-cloud data transfers are reduced by at least 38% and the messaging costs are reduced by at least 50% when using existing commercial edge cloud cost models.

Citations: 20

Anomaly Detection and Classification using Distributed Tracing and Deep Learning

2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID) Pub Date : 2019-05-01 DOI: 10.1109/CCGRID.2019.00038

S. Nedelkoski, Jorge Cardoso, O. Kao

Abstract: Artificial Intelligence for IT Operations (AIOps) combines big data and machine learning to replace a broad range of IT operations tasks, including availability, performance, and monitoring of services. By exploiting log, tracing, metric, and network data, AIOps enables the detection of faults and issues in services. The focus of this work is on detecting anomalies based on distributed tracing records that contain detailed information about the availability and the response time of the services. In large-scale distributed systems, where a service is deployed on heterogeneous hardware and has multiple scenarios of normal operation, it becomes challenging to detect such anomalous cases. We address the problem by proposing unsupervised response time anomaly detection based on deep learning data modeling techniques; an unsupervised dynamic error threshold approach; a tolerance module for false positive reduction; and a descriptive classification of the anomalies. The evaluation shows that the approach achieves high accuracy and solid performance in both an experimental testbed and a large-scale production cloud.
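
The abstract names a dynamic error threshold and a tolerance module but does not say how either is computed. As a rough illustration of what such components can look like, the sketch below makes two loudly stated assumptions that do not come from the paper: the threshold is taken as mean plus `k` standard deviations of the model's errors on normal data, and the tolerance rule flags a point only when more than `tolerance` of the last `window` errors exceed that threshold. All function and parameter names are hypothetical.

```python
from statistics import mean, stdev


def dynamic_threshold(errors, k=3.0):
    """Assumed rule: threshold = mean + k * standard deviation of the errors."""
    return mean(errors) + k * stdev(errors)


def flag_anomalies(errors, threshold, window=5, tolerance=2):
    """Flag position i only if more than `tolerance` of the last `window`
    errors exceed the threshold, which suppresses isolated spikes."""
    flags = []
    for i in range(len(errors)):
        recent = errors[max(0, i - window + 1): i + 1]
        exceeded = sum(e > threshold for e in recent)
        flags.append(exceeded > tolerance)
    return flags
```

With these rules a single slow response is ignored, while a sustained run of large errors is reported; the window and tolerance values trade detection delay against false positives.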

Citations: 40

Adaptive Quality Optimization of Computer Vision Tasks in Resource-Constrained Devices using Edge Computing

2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID) Pub Date : 2019-05-01 DOI: 10.1109/CCGRID.2019.00061

Anas Toma, Juri Wenner, J. E. Lenssen, Jian-Jia Chen

Abstract: This paper presents an approach to optimize the quality of computer vision tasks in resource-constrained devices by using different execution versions of the same task. The execution versions are generated by dropping irrelevant content from the input images, or other content that has a marginal effect on the quality of the result. Our execution model is designed to support the edge computing paradigm, where tasks can be executed remotely on edge nodes either to improve the quality or to reduce the workload of the local device. We also propose an algorithm that selects suitable execution versions, which includes selecting the configuration and the location of the execution, in order to maximize the total quality of the tasks based on the available resources. The proposed approach provides reliable and adaptive task execution by using several execution versions with various performance and quality trade-offs. It is therefore very beneficial for systems with resource and timing constraints, such as portable medical devices, surveillance video cameras, and wearable systems. The proposed algorithm is evaluated using different computer vision benchmarks.
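
Choosing one execution version per task so that total quality is maximized within a resource limit has the shape of a multiple-choice knapsack problem. The sketch below solves that textbook formulation with dynamic programming; it is an assumption about how the selection could be modeled, not the algorithm proposed in the paper. The `(cost, quality)` pairs, the single integer `budget`, and the function name are hypothetical; a local and a remote variant of the same configuration would simply appear as two versions of a task.

```python
def select_versions(tasks, budget):
    """Pick exactly one version per task to maximise total quality.

    tasks: one list per task of (cost, quality) pairs with integer costs.
    Returns (total_quality, chosen_index_per_task), or None if no
    combination fits within the budget.
    """
    best = {0: (0, [])}  # used cost -> (total quality, chosen version per task)
    for versions in tasks:
        nxt = {}
        for used, (quality, chosen) in best.items():
            for idx, (cost, gain) in enumerate(versions):
                total = used + cost
                if total > budget:
                    continue  # this version does not fit in what is left
                if total not in nxt or nxt[total][0] < quality + gain:
                    nxt[total] = (quality + gain, chosen + [idx])
        if not nxt:
            return None  # no version of this task fits
        best = nxt
    return max(best.values(), key=lambda entry: entry[0])
```

For example, with two tasks `[[(1, 5), (3, 9)], [(2, 6), (4, 9)]]` and a budget of 5, the best choice is the expensive version of the first task and the cheap version of the second.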

Citations: 7

CCGrid 2019 Committees

2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID) Pub Date : 2019-05-01 DOI: 10.1109/ccgrid.2019.00008

Citations: 0

Welcome from the Program Chairs

2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID) Pub Date : 2019-05-01 DOI: 10.1109/ccgrid.2019.00006

Carlos Castillo, Donald Metzler

Citations: 0

Optimized Memory Management for a Java-Based Distributed In-memory System

2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID) Pub Date : 2019-05-01 DOI: 10.1109/CCGRID.2019.00086

Stefan Nothaas, Kevin Beineke, M. Schöttner

Abstract: Several Java-based distributed in-memory systems have been proposed in the literature, but most do not target graph applications with highly concurrent and irregular access patterns to many small data objects. DXRAM addresses these challenges and relies on DXMem for memory management and concurrency control on each server. DXMem is published as an open-source library that can also be used by any other system. In this paper, we briefly describe the previously published but relevant design aspects of the memory management. The main contributions of this paper, however, are the new extensions, optimizations, and evaluations. These contributions include an improved address translation, which is now faster than the old solution with a translation cache. The coarse-grained concurrency control of our first approach has been replaced by a very efficient per-object read-write lock, which allows much better throughput, especially under high concurrency. Finally, we compared DXRAM for the first time to Hazelcast and Infinispan, two state-of-the-art Java-based distributed cache systems, using real-world application workloads and the Yahoo! Cloud Serving Benchmark in a distributed environment. The results of the experiments show that DXRAM outperforms both systems while having a much lower metadata overhead for many small data objects.

Citations: 1

NBBS: A Non-Blocking Buddy System for Multi-core Machines

2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID) Pub Date : 2019-05-01 DOI: 10.1109/CCGRID.2019.00011

Romolo Marotta, Mauro Ianni, Andrea Scarselli, Alessandro Pellegrini, F. Quaglia

Abstract: Common implementations of core memory allocation components, like the Linux buddy system, handle concurrent allocation/release requests by synchronizing threads via spin-locks. This approach does not scale well, a problem that has been addressed in the literature by introducing layered allocation services or by replicating the core allocators (the bottommost ones within the layered architecture). Both solutions tend to reduce the pressure of actual concurrent accesses on each individual core allocator. In this article we explore an alternative approach to the scalability of memory allocation/release, which can still be combined with those proposals from the literature. We present a fully non-blocking buddy system, where threads performing concurrent allocations/releases do not undergo any spin-lock based synchronization. Our solution allows threads to proceed in parallel and commit their allocations/releases unless a conflict materializes while handling the allocator metadata. Conflict detection relies on atomic Read-Modify-Write (RMW) machine instructions. Beyond improving scalability and performance, our solution can also avoid wasting clock cycles on spin-lock operations by threads that could in principle carry out their memory allocations/releases in full concurrency.
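
The conflict-detection pattern mentioned here (proceed optimistically, commit with an atomic read-modify-write, retry on conflict) can be shown on a much simpler structure than a buddy tree. The sketch below is an illustration under stated assumptions, not NBBS itself: it manages a flat bitmap of equally sized blocks rather than a buddy hierarchy, and because Python exposes no hardware compare-and-swap, the hypothetical `AtomicWord` class emulates one with an internal lock.

```python
import threading


class AtomicWord:
    """Stand-in for a machine word with atomic compare-and-swap.
    Python has no hardware CAS, so a lock emulates the atomicity here."""

    def __init__(self, value=0):
        self._value = value
        self._guard = threading.Lock()

    def load(self):
        return self._value

    def compare_and_swap(self, expected, new):
        with self._guard:
            if self._value != expected:
                return False  # someone else changed the word: conflict
            self._value = new
            return True


def allocate(bitmap, nblocks):
    """Claim one free block; return its index, or None if all are taken."""
    while True:
        old = bitmap.load()
        free = [i for i in range(nblocks) if not (old >> i) & 1]
        if not free:
            return None
        if bitmap.compare_and_swap(old, old | (1 << free[0])):
            return free[0]
        # CAS failed: the metadata changed underneath us, so re-read and retry.


def release(bitmap, index):
    """Clear the block's bit, retrying if a concurrent update interferes."""
    while True:
        old = bitmap.load()
        if bitmap.compare_and_swap(old, old & ~(1 << index)):
            return
```

No thread ever waits while holding the allocator's metadata: a thread that loses the race re-reads the word and tries again, which is the property a spin-lock based design cannot offer.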

Citations: 1