2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID): Latest Publications

Visual Performance Analysis of Memory Behavior in a Task-Based Runtime on Hybrid Platforms
Lucas Leandro Nesi, Samuel Thibault, Luka Stanisic, L. Schnorr
DOI: 10.1109/CCGRID.2019.00025
Abstract: Programming parallel applications for heterogeneous HPC platforms is much more straightforward when using the task-based programming paradigm. The simplicity exists because a runtime takes care of many activities usually carried out by the application developer, such as task mapping, load balancing, and memory management operations. In this paper, we present a visualization-based performance analysis methodology to investigate the CPU-GPU-disk memory management of the StarPU runtime, a popular task-based middleware for HPC applications. We detail the design of novel graphical strategies that were fundamental to recognizing performance problems in four case studies. We first identify poor management of data handles when GPU memory is saturated, leading to low application performance. Our experiments using the dense tiled Cholesky factorization show that our fix leads to performance gains of 66% and better scalability for larger input sizes. In the other three cases, we study scenarios where the main memory is insufficient to store all the application's data, forcing the runtime to store data out-of-core. Using our methodology, we pinpoint differences in behavior among schedulers and identify a crucial problem in the application code regarding initial block placement, which leads to poor performance.
Citations: 12
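The tiled Cholesky factorization used in the experiments above is a classic task-graph workload. Below is a generic sketch (not the paper's StarPU code) of how a runtime enumerates the POTRF/TRSM/SYRK/GEMM tasks that it then maps to CPUs and GPUs while managing the data transfers:

```python
def cholesky_tasks(num_tiles):
    """Enumerate the task graph of a right-looking tiled Cholesky
    factorization; a runtime such as StarPU schedules these tasks
    and moves the tiles they touch between CPU, GPU, and disk."""
    tasks = []
    for k in range(num_tiles):
        tasks.append(("POTRF", k, k))            # factorize diagonal tile
        for i in range(k + 1, num_tiles):
            tasks.append(("TRSM", i, k))         # triangular solve on panel tile
        for i in range(k + 1, num_tiles):
            tasks.append(("SYRK", i, k))         # symmetric rank-k update
            for j in range(k + 1, i):
                tasks.append(("GEMM", i, j, k))  # trailing-matrix update
    return tasks
```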
One Can Only Gain by Replacing EASY Backfilling: A Simple Scheduling Policies Case Study
Danilo Carastan-Santos, R. Camargo, D. Trystram, Salah Zrigui
DOI: 10.1109/CCGRID.2019.00010
Abstract: High-Performance Computing (HPC) platforms are growing in size and complexity. To improve the quality of service of such platforms, researchers devote a great deal of effort to devising algorithms and techniques that improve different aspects of performance, such as energy consumption, total platform usage, and fairness between users. Despite this, system administrators remain reluctant to deploy state-of-the-art scheduling methods, and most revert to EASY backfilling, also known as EASY-FCFS (EASY-First-Come-First-Served): newer methods are frequently complex and opaque, and the simplicity and transparency of EASY are too important to sacrifice. In this work, we used execution logs from five HPC platforms to compare four simple scheduling policies: FCFS, Shortest estimated Processing time First (SPF), Smallest Requested Resources First (SQF), and Smallest estimated Area First (SAF). Using simulations, we performed a thorough analysis of the cumulative results for up to 180 weeks, considering three scheduling objectives: waiting time, slowdown, and per-processor slowdown. We also evaluated other effects, such as the relationship between job size and slowdown, the distribution of slowdown values, and the number of backfilled jobs, for each HPC platform and scheduling policy. We conclude that one can only gain by replacing EASY backfilling with SAF with backfilling, as it improves the slowdown metric by up to 80% while maintaining the simplicity and transparency of FCFS. Moreover, SAF reduces the number of jobs with large slowdowns, and the inclusion of a simple thresholding mechanism guarantees that no starvation occurs. Finally, we propose SAF as a new benchmark for future scheduling studies.
Citations: 26
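The four policies compared in this study reduce to different sort keys over the waiting queue. A minimal sketch, with illustrative job fields (not the paper's simulator code):

```python
# Each job is a dict with submit time, estimated runtime, and
# requested resources; the field names here are hypothetical.
def fcfs_key(job):   # First-Come-First-Served
    return job["submit"]

def spf_key(job):    # Shortest estimated Processing time First
    return job["est_runtime"]

def sqf_key(job):    # Smallest Requested Resources First
    return job["resources"]

def saf_key(job):    # Smallest estimated Area First
    return job["est_runtime"] * job["resources"]  # area = runtime x resources

def order_queue(jobs, key):
    """Order the waiting queue under a policy; a backfilling step
    would then fill idle holes without delaying the head job."""
    return sorted(jobs, key=key)
```

Under SAF, a short narrow job jumps ahead of a long wide one even if it was submitted later, which is exactly the behavior that reduces large slowdowns.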
[Copyright notice]
DOI: 10.1109/ccgrid.2019.00003
Citations: 0
Enabling Large Scale Data Production for OpenDose with GATE on the EGI Infrastructure
M. Chauvin, Gilles Mathieu, S. Camarasu-Pop, Axel Bonnet, M. Bardiès, I. Perseil
DOI: 10.1109/CCGRID.2019.00084
Abstract: The OpenDose collaboration was established to generate an open and traceable reference database of dosimetric data for nuclear medicine, using a variety of Monte Carlo codes. The amount of data to generate requires running tens of thousands of simulations per anthropomorphic model, for a total computation time estimated at millions of CPU hours. To tackle this challenge, a project was initiated to enable large-scale data production with the Monte Carlo code GATE. Within this project, CRCT, Inserm CISI, and CREATIS developed solutions to run GATE simulations on the EGI grid infrastructure using existing tools such as VIP and GateLab. These developments include a new GATE grid application deployed on VIP, modifications to the existing GateLab application, and client code that uses a REST API to drive both. The tools developed so far have allowed running 30% of the GATE simulations for the first two models (adult male and adult female). Ongoing and future work includes improvements to both the code and the submission strategies, documentation and packaging of the code, definition and implementation of a long-term storage strategy, extension to other models, and generalisation of the tools to the other Monte Carlo codes used within the OpenDose collaboration.
Citations: 3
A Performance Driven Micro Services-Based Architecture/System for Analyzing Noisy IoT Data
M. Bolic, S. Majumdar
DOI: 10.1109/CCGRID.2019.00031
Abstract: The Internet of Things (IoT) presents a complex and challenging paradigm in which a huge amount of noisy raw sensor data is collected in order to observe and detect critical events occurring in the system and to generate alarms when required. The biggest challenge of IoT systems is that they collect massive amounts of uncertain data from diverse IoT devices connected through the network. In addition, some events are inferred from other events, and uncertainty propagates from parent events to the inferred events, which further contributes to overall system uncertainty. Observed complex events are composed of primitive events that are produced by IoT devices and collected in IoT systems. A survey of prior art on quantifying uncertainty for complex events concluded that existing solutions are unable to scale under heavy loads of incoming data. This paper presents a microservice-based notification methodology that uses complex event recognition (both complex event processing and probabilistic programming) to handle the uncertainty of IoT systems. In addition, the paper analyzes and recommends existing big data platforms for processing complex events in IoT systems. The current focus of our work includes research and development of an optimized deadline-based and cost-effective resource allocation algorithm in Apache Spark for uncertain IoT notification systems.
Citations: 4
Scalable Video Transcoding in Public Clouds
Qingye Jiang, Young Choon Lee, Albert Y. Zomaya
DOI: 10.1109/CCGRID.2019.00017
Abstract: In this paper, we present the challenges involved in a large-scale video transcoding application in public clouds. We introduce the architecture of an existing video transcoding system that is tightly coupled with an existing video sharing service. We examine the horizontal scalability of the video transcoding system on AWS EC2. With an online transaction processing (OLTP) model, the system achieves linear horizontal scalability up to 1,000 vCPU cores, but starts to experience performance degradation beyond that. We analyze the resource consumption pattern of the existing system, then introduce an improved architecture by adding a message queue layer. This effectively decouples the video transcoding system from the video sharing service and converts the OLTP model into a batch processing model. Large-scale evaluations on AWS EC2 indicate that the improved design maintains linear horizontal scalability at 10,100 vCPU cores. The hybrid design of the system allows it to be easily adapted for other batch processing use cases without the need to modify or recompile the application.
Citations: 5
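The decoupling described above, converting OLTP-style direct calls into batch processing through a message queue, can be sketched with an in-process queue; the real system would use a distributed broker, and the transcode step here is a stand-in:

```python
import queue
import threading

def transcode(job):
    # Placeholder for the actual transcoding work (e.g. an ffmpeg run).
    return job + ".transcoded"

def worker(q, results):
    """Consumer side: drain jobs from the queue until a None sentinel
    arrives. Producers enqueue and return immediately, so the video
    sharing service never waits on the transcoder."""
    while True:
        job = q.get()
        if job is None:
            q.task_done()
            break
        results.append(transcode(job))
        q.task_done()
```

Because producers only touch the queue, the worker pool can be scaled horizontally without changing the producer side, which is the property the paper exploits to reach high vCPU counts.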
Theoretical Scalability Analysis of Distributed Deep Convolutional Neural Networks
Adrián Castelló, M. F. Dolz, E. S. Quintana‐Ortí, J. Duato
DOI: 10.1109/CCGRID.2019.00068
Abstract: We analyze the asymptotic performance of the training process of deep neural networks (NNs) on clusters in order to determine their scalability. For this purpose, i) we assume a data-parallel implementation of the training algorithm, which distributes the batches among the cluster nodes and replicates the model; ii) we leverage the roofline model to inspect the performance at the node level, taking into account the floating-point unit throughput and memory bandwidth; and iii) we consider distinct collective communication schemes that are optimal depending on the message size and the underlying network interconnection topology. We then apply the resulting performance model to analyze the scalability of several well-known deep convolutional neural networks as a function of the batch size, node floating-point throughput, node memory bandwidth, cluster dimension, and link bandwidth.
Citations: 12
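The roofline model mentioned in point ii) bounds attainable node performance by either the compute peak or the memory traffic, whichever is lower. A one-function sketch (the numbers in the usage note are illustrative, not taken from the paper):

```python
def roofline(peak_flops, mem_bandwidth, arithmetic_intensity):
    """Attainable performance (FLOP/s) under the roofline model:
    memory-bound below the ridge point, compute-bound above it.

    peak_flops           -- node peak floating-point rate (FLOP/s)
    mem_bandwidth        -- node memory bandwidth (bytes/s)
    arithmetic_intensity -- kernel FLOPs per byte of memory traffic
    """
    return min(peak_flops, mem_bandwidth * arithmetic_intensity)
```

For example, a node with a 10 TFLOP/s peak and 900 GB/s of bandwidth is memory-bound for a kernel at 4 FLOP/byte but compute-bound for one at 100 FLOP/byte.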
Scalability of the NewMadeleine Communication Library for Large Numbers of MPI Point-to-Point Requests
Alexandre Denis
DOI: 10.1109/CCGRID.2019.00051
Abstract: New kinds of applications have emerged that use many threads or irregular communication patterns and rely heavily on point-to-point MPI communication. They stress the MPI library with potentially many simultaneous MPI requests for sending and receiving at the same time. When dealing with large numbers of simultaneous requests, the bottleneck lies in two main mechanisms: tag matching (the algorithm that matches an incoming packet with a posted receive request) and the progression engine. In this paper, we propose algorithms and implementations that overcome these issues so as to scale up to thousands of requests if needed. In particular, our algorithms perform constant-time tag matching even with any-source and any-tag support. We have implemented these mechanisms in our NewMadeleine communication library. Through micro-benchmarks and computation-kernel benchmarks, we demonstrate that our MPI library exhibits better performance than state-of-the-art MPI implementations in cases with many simultaneous requests.
Citations: 6
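Tag matching becomes constant time for fully specified requests once posted receives are indexed by (source, tag). The toy below illustrates only that idea: unlike the paper's algorithm, it still scans wildcard requests linearly and ignores MPI's posted-order matching rules:

```python
from collections import deque

ANY_SOURCE = -1  # stand-ins for MPI_ANY_SOURCE / MPI_ANY_TAG
ANY_TAG = -1

class TagMatcher:
    """Index posted receives by (source, tag) for O(1) matching;
    wildcard requests go to a separate FIFO."""
    def __init__(self):
        self.exact = {}      # (source, tag) -> FIFO of requests
        self.wild = deque()  # (source, tag, request) tuples with wildcards

    def post_recv(self, source, tag, request):
        if source == ANY_SOURCE or tag == ANY_TAG:
            self.wild.append((source, tag, request))
        else:
            self.exact.setdefault((source, tag), deque()).append(request)

    def match(self, source, tag):
        """Match an incoming packet against posted receives."""
        fifo = self.exact.get((source, tag))
        if fifo:
            return fifo.popleft()
        for i, (s, t, request) in enumerate(self.wild):
            if s in (ANY_SOURCE, source) and t in (ANY_TAG, tag):
                del self.wild[i]
                return request
        return None  # no match: a real library queues it as unexpected
```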
Application-Level Differential Checkpointing for HPC Applications with Dynamic Datasets
Kai Keller, L. Bautista-Gomez
DOI: 10.1109/CCGRID.2019.00015
Abstract: High-performance computing (HPC) requires resilience techniques such as checkpointing in order to tolerate failures in supercomputers. As the number of nodes and the amount of memory in supercomputers keep increasing, the size of checkpoint data also increases dramatically, sometimes causing an I/O bottleneck. Differential checkpointing (dCP) aims to minimize the checkpointing overhead by writing only data differences. This is typically implemented at the memory-page level, sometimes complemented with hashing algorithms. However, such a technique is unable to cope with dynamically sized datasets. In this work, we present a novel dCP implementation with a new file format that allows fragmentation of protected datasets in order to support dynamic sizes. We identify dirty data blocks using hash algorithms. To evaluate dCP performance, we ported the HPC applications xPic, LULESH 2.0, and Heat2D, and analyzed their potential for reducing I/O with dCP and how this data reduction influences checkpoint performance. In our experiments, we achieve reductions of up to 62% in checkpoint time.
Citations: 7
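Hash-based dirty-block detection, the mechanism this abstract describes for identifying data differences, can be sketched as follows; the block size and hash function are illustrative choices, not the paper's:

```python
import hashlib

BLOCK_SIZE = 4096  # illustrative; real dCP fragments datasets, not pages

def block_hashes(data, block_size=BLOCK_SIZE):
    """Hash every fixed-size block of a protected dataset."""
    return [hashlib.sha1(data[i:i + block_size]).digest()
            for i in range(0, len(data), block_size)]

def dirty_blocks(data, previous_hashes, block_size=BLOCK_SIZE):
    """Indices of blocks whose hash changed since the last checkpoint;
    a differential checkpoint writes only these blocks. Blocks beyond
    the old length (a grown dataset) always count as dirty."""
    current = block_hashes(data, block_size)
    return [i for i, digest in enumerate(current)
            if i >= len(previous_hashes) or previous_hashes[i] != digest]
```

If only a few blocks change between checkpoints, the write volume drops proportionally, which is where the reported checkpoint-time reduction comes from.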
Welcome from the General Chairs
DOI: 10.1109/ccgrid.2019.00005
Citations: 0