{"title":"Welcome Message from the CCGRID 2019 Workshop/Tutorial Chairs","authors":"G. Pallis, A. Toosi, B. Sotomayor","doi":"10.1109/ccgrid.2019.00007","DOIUrl":"https://doi.org/10.1109/ccgrid.2019.00007","url":null,"abstract":"Welcome to the 19th Annual IEEE/ACM International Symposium in Cluster, Cloud, and Grid Computing (CCGrid 2019)! At this year’s CCGRID, the conference’s main program is supplemented by a diverse offering of workshops and tutorials that will allow you to explore a variety of topics in greater depth. On Wednesday, May 14th, we are hosting seven workshops that will include paper presentations, keynotes, invited talks, and discussions:","PeriodicalId":234571,"journal":{"name":"2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)","volume":"58 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134311528","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"[Publisher's information]","authors":"","doi":"10.1109/ccgrid.2019.00091","DOIUrl":"https://doi.org/10.1109/ccgrid.2019.00091","url":null,"abstract":"","PeriodicalId":234571,"journal":{"name":"2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117060408","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Proximity-Aware Traffic Routing in Distributed Fog Computing Platforms","authors":"Alice Fahs, G. Pierre","doi":"10.1109/CCGRID.2019.00062","DOIUrl":"https://doi.org/10.1109/CCGRID.2019.00062","url":null,"abstract":"Container orchestration engines such as Kubernetes do not take into account the geographical location of application replicas when deciding which replica should handle which request. This makes them ill-suited to act as a general-purpose fog computing platforms where the proximity between end users and the replica serving them is essential. We present proxy-mity, a proximity-aware traffic routing system for distributed fog computing platforms. It seamlessly integrates in Kubernetes, and provides very simple control mechanisms to allow system administrators to address the necessary trade-off between reducing the user-to-replica latencies and balancing the load equally across replicas. proxy-mity is very lightweight and it can reduce average user-to-replica latencies by as much as 90% while allowing the system administrators to control the level of load imbalance in their system.","PeriodicalId":234571,"journal":{"name":"2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122207408","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Miracle: An Agile Colocation Platform for Enabling XaaS Cloud Architecture","authors":"Mukhtiar Bano, Umar Ahmad Qureshi, R. N. B. Rais, M. Tufail, A. Qayyum","doi":"10.1109/CCGRID.2019.00078","DOIUrl":"https://doi.org/10.1109/CCGRID.2019.00078","url":null,"abstract":"Cloud Service Providers (CSPs) offer services to large scale business organizations using different cloud frameworks including Platform as a Service (PaaS), Software as a Service (SaaS), Infrastructure as a Service (IaaS) and follow various business models to realize profitability while addressing the challenges regarding scalability, guaranteed connectivity and security. In order to address such issues, Colocation service providers (third-party organizations) lease physical infrastructure services in geographically distributed locations known as colocation points (Co-Lo). Organization eventually request services available within a given colocation facility. Such a solution is challenging in a way that not all service offerings are available on a given Co-Lo provider. The proposed platform 'Miracle' addresses aforementioned challenges by providing agility and cost reduction to organizations aiming to leverage colocation services. Our solution leverages Software Defined Networking along with service and performance awareness technology to virtualize the organization's presence in a Co-Lo. Hence, organization can be associated or re-associated on the fly to required Co-Lo according to its service connectivity needs at a given time. Our solution may benefit cloud services industry and their customers while adopting the open-source computing technologies such as virtualization, dynamic provisioning environment, and multi-tenancy resulting in offering increased efficiency in a profitable and cost effective manner.","PeriodicalId":234571,"journal":{"name":"2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130683982","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Performance Driven Micro Services-Based Architecture/System for Analyzing Noisy IoT Data","authors":"M. Bolic, S. Majumdar","doi":"10.1109/CCGRID.2019.00031","DOIUrl":"https://doi.org/10.1109/CCGRID.2019.00031","url":null,"abstract":"The Internet of Things (IoT) technology presents a complex and challenging paradigm where a huge amount of noisy raw sensor data is collected in order to observe and detect critical events occurring on the system, and generate alarms when required. The biggest challenge of the IoT systems is that the systems collect a massive amount of uncertain data from diverse IoT devices connected through the network. In addition, some events are inferred from other events and uncertainty is propagated from parent events to the inferred events, which additionally contributes to overall system uncertainty. The observed complex events are a complex relationship of primitive events that are produced by IoT devices and collected in IoT systems. A survey performed on existing prior arts on quantifying uncertainty for complex events concluded that proposed existing solutions are unable to scale under heavy loads of incoming data. This paper presents a micro-service based notification methodology that uses complex event recognition (both complex event processing and probabilistic programming) to handle IoT systems uncertainty. In addition, the paper analyzes and recommends existing big data platforms for processing complex events in IoT systems. The current focus of our work includes research and development of the optimized deadline-based and cost-effective resource allocation algorithm in Apache Spark for Uncertain IoT Notification systems.","PeriodicalId":234571,"journal":{"name":"2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)","volume":" 24","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"113952839","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Scalable Video Transcoding in Public Clouds","authors":"Qingye Jiang, Young Choon Lee, Albert Y. Zomaya","doi":"10.1109/CCGRID.2019.00017","DOIUrl":"https://doi.org/10.1109/CCGRID.2019.00017","url":null,"abstract":"In this paper, we present the challenges involved in large-scale video transcoding application in public clouds. We introduce the architecture of an existing video transcoding system which is tightly coupled with an existing video sharing service. We examine the horizontal scalability of the video transcoding system on AWS EC2. With an online transaction processing (OLTP) model, the system achieves linear horizontal scalability up to 1,000 vCPU cores, but starts to experience performance degradation beyond that. We analyze the resource consumption pattern of the existing system, then introduce an improved architecture by adding a message queue layer. This effectively decouples the video transcoding system from the video sharing service and converts the OLTP model into a batch processing model. Large-scale evaluations on AWS EC2 indicate that the improved design maintains linear horizontal scalability at 10,100 vCPU cores. The hybrid design of the system allows it to be easily adapted for other batch processing use cases without the need to modify or recompile the application.","PeriodicalId":234571,"journal":{"name":"2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)","volume":"408 23","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114008055","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Theoretical Scalability Analysis of Distributed Deep Convolutional Neural Networks","authors":"Adrián Castelló, M. F. Dolz, E. S. Quintana‐Ortí, J. Duato","doi":"10.1109/CCGRID.2019.00068","DOIUrl":"https://doi.org/10.1109/CCGRID.2019.00068","url":null,"abstract":"We analyze the asymptotic performance of the training process of deep neural networks (NN) on clusters in order to determine the scalability. For this purpose, i) we assume a data parallel implementation of the training algorithm, which distributes the batches among the cluster nodes and replicates the model; ii) we leverage the roofline model to inspect the performance at the node level, taking into account the floating-point unit throughput and memory bandwidth; and iii) we consider distinct collective communication schemes that are optimal depending on the message size and underlying network interconnection topology. We then apply the resulting performance model to analyze the scalability of several well-known deep convolutional neural networks as a function of the batch size, node floating-point throughput, node memory bandwidth, cluster dimension, and link bandwidth.","PeriodicalId":234571,"journal":{"name":"2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121525513","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Scalability of the NewMadeleine Communication Library for Large Numbers of MPI Point-to-Point Requests","authors":"Alexandre Denis","doi":"10.1109/CCGRID.2019.00051","DOIUrl":"https://doi.org/10.1109/CCGRID.2019.00051","url":null,"abstract":"New kinds of applications with lots of threads or irregular communication patterns which rely a lot on point-to-point MPI communications have emerged. It stresses the MPI library with potentially a lot of simultaneous MPI requests for sending and receiving at the same time. To deal with large numbers of simultaneous requests, the bottleneck lies in two main mechanisms: the tag-matching (the algorithm that matches an incoming packet with a posted receive request), and the progression engine. In this paper, we propose algorithms and implementations that overcome these issues so as to scale up to thousands of requests if needed. In particular our algorithms are able to perform constant-time tag-matching even with any-source and any-tag support. We have implemented these mechanisms in our NewMadeleine communication library. Through micro-benchmarks and computation kernel benchmarks, we demonstrate that our MPI library exhibits better performance than state-of-the-art MPI implementations in cases with many simultaneous requests.","PeriodicalId":234571,"journal":{"name":"2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126090065","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Application-Level Differential Checkpointing for HPC Applications with Dynamic Datasets","authors":"Kai Keller, L. Bautista-Gomez","doi":"10.1109/CCGRID.2019.00015","DOIUrl":"https://doi.org/10.1109/CCGRID.2019.00015","url":null,"abstract":"High-performance computing (HPC) requires resilience techniques such as checkpointing in order to tolerate failures in supercomputers. As the number of nodes and memory in supercomputers keeps on increasing, the size of checkpoint data also increases dramatically, sometimes causing an I/O bottleneck. Differential checkpointing (dCP) aims to minimize the checkpointing overhead by only writing data differences. This is typically implemented at the memory page level, sometimes complemented with hashing algorithms. However, such a technique is unable to cope with dynamic-size datasets. In this work, we present a novel dCP implementation with a new file format that allows fragmentation of protected datasets in order to support dynamic sizes. We identify dirty data blocks using hash algorithms. In order to evaluate the dCP performance, we ported the HPC applications xPic, LULESH 2.0 and Heat2D and analyze them regarding their potential of reducing I/O with dCP and how this data reduction influences the checkpoint performance. In our experiments, we achieve reductions of up to 62% of the checkpoint time.","PeriodicalId":234571,"journal":{"name":"2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133529663","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Welcome from the General Chairs","authors":"","doi":"10.1109/ccgrid.2019.00005","DOIUrl":"https://doi.org/10.1109/ccgrid.2019.00005","url":null,"abstract":"","PeriodicalId":234571,"journal":{"name":"2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)","volume":"59 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123417550","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}