2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing最新文献

筛选
英文 中文
Scaling NWChem with Efficient and Portable Asynchronous Communication in MPI RMA 基于MPI RMA的高效可移植异步通信扩展NWChem
Min Si, Antonio J. Peña, J. Hammond, P. Balaji, Y. Ishikawa
{"title":"Scaling NWChem with Efficient and Portable Asynchronous Communication in MPI RMA","authors":"Min Si, Antonio J. Peña, J. Hammond, P. Balaji, Y. Ishikawa","doi":"10.1109/CCGrid.2015.48","DOIUrl":"https://doi.org/10.1109/CCGrid.2015.48","url":null,"abstract":"NWChem is one of the most widely used computational chemistry application suites for chemical and biological systems. Despite its vast success, the computational efficiency of NWChem is still low. This is especially true in higher accuracy methods such as the CCSD(T) coupled cluster method, where it currently achieves a mere 50% computational efficiency when run at large scales. In this paper, we demonstrate the most computationally efficient scaling of NWChem CCSD(T) to date, and use it to solve large water clusters. We use our recently proposed process-based asynchronous progress framework for MPI RMA, called Casper, to scale the computation on water clusters at near-100% computational efficiency on up to 12288 cores.","PeriodicalId":6664,"journal":{"name":"2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing","volume":"3 1","pages":"811-816"},"PeriodicalIF":0.0,"publicationDate":"2015-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90186568","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 7
Taming Latency in Data Center Networking with Erasure Coded Files 用Erasure编码文件控制数据中心网络中的延迟
2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing Pub Date : 2015-05-04 DOI: 10.1109/CCGrid.2015.142
Yu Xiang, V. Aggarwal, Y. Chen, Tian Lan
{"title":"Taming Latency in Data Center Networking with Erasure Coded Files","authors":"Yu Xiang, V. Aggarwal, Y. Chen, Tian Lan","doi":"10.1109/CCGrid.2015.142","DOIUrl":"https://doi.org/10.1109/CCGrid.2015.142","url":null,"abstract":"This paper proposes an approach to minimize service latency in a data center network where erasure-coded files are stored on distributed disks/racks and access requests are scattered across the network. Due to limited bandwidth available at both top-of-the-rack and aggregation switches, network bandwidth must be apportioned among different intra-and inter-rack data flows in line with their traffic statistics. We formulate this problem as weighted queuing and employ a class of probabilistic request scheduling policies to derive a closed-form outer-bound of service latency for erasure-coded storage with arbitrary file access patterns and service time distributions. The result enables us to propose a joint latency optimization over three entangled \"control knobs\": the bandwidth allocation at top-of-the-rack and aggregation switches, the probabilities for scheduling file requests, and the placement of encoded file chunks, which affects data locality. The joint optimization is shown to be a mixed-integer problem. We develop an iterative algorithm which decouples and solves the joint optimization as three sub-problems, which are either convex or solvable via bipartite matching in polynomial time. The proposed algorithm is prototyped in an open-source, distributed file system, Tahoe, and evaluated on a cloud tested with 16 separate physical hosts in an Open Stack cluster. Experiments validate our theoretical latency analysis and show significant latency reduction for diverse file access patterns. The results provide valuable insight on designing low-latency data center networks with erasure-coded storage.","PeriodicalId":6664,"journal":{"name":"2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing","volume":"45 1","pages":"241-250"},"PeriodicalIF":0.0,"publicationDate":"2015-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88296218","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 9
Cloud-Based Machine Learning Tools for Enhanced Big Data Applications 增强大数据应用的基于云的机器学习工具
2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing Pub Date : 2015-05-04 DOI: 10.1109/CCGrid.2015.170
A. Cuzzocrea, E. Mumolo, P. Corona
{"title":"Cloud-Based Machine Learning Tools for Enhanced Big Data Applications","authors":"A. Cuzzocrea, E. Mumolo, P. Corona","doi":"10.1109/CCGrid.2015.170","DOIUrl":"https://doi.org/10.1109/CCGrid.2015.170","url":null,"abstract":"We propose Cloud-based machine learning tools for enhanced Big Data applications, where the main idea is that of predicting the \"next\" workload occurring against the target Cloud infrastructure via an innovative ensemble-based approach that combine the effectiveness of different well-known classifiers in order to enhance the whole accuracy of the final classification, which is very relevant at now in the specific context of Big Data. So-called workload categorization problem plays a critical role towards improving the efficiency and the reliability of Cloud-based big data applications. Implementation-wise, our method proposes deploying Cloud entities that participate to the distributed classification approach on top of virtual machines, which represent classical \"commodity\" settings for Cloud-based big data applications. Preliminary experimental assessment and analysis clearly confirm the benefits deriving from our classification framework.","PeriodicalId":6664,"journal":{"name":"2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing","volume":"60 1","pages":"908-914"},"PeriodicalIF":0.0,"publicationDate":"2015-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78854926","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 3
Study of the KVM CPU Performance of Open-Source Cloud Management Platforms 开源云管理平台的KVM CPU性能研究
2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing Pub Date : 2015-05-04 DOI: 10.1109/CCGrid.2015.103
F. Gomez-Folgar, A. García-Loureiro, T. F. Pena, J. I. Zablah, N. Seoane
{"title":"Study of the KVM CPU Performance of Open-Source Cloud Management Platforms","authors":"F. Gomez-Folgar, A. García-Loureiro, T. F. Pena, J. I. Zablah, N. Seoane","doi":"10.1109/CCGrid.2015.103","DOIUrl":"https://doi.org/10.1109/CCGrid.2015.103","url":null,"abstract":"Nowadays, there are several open-source solutions for building private, public and even hybrid clouds such as Eucalyptus, Apache Cloud Stack and Open Stack. KVM is one of the supported hypervisors for these cloud platforms. Different KVM configurations are being supplied by these platforms and, in some cases, a subset of CPU features are being presented to guest systems, providing a basic abstraction of the underlying CPU. One of the reasons for limiting the features of the Virtual CPU is to guarantee the guest compatibility with different hardware in heterogeneous environments. However, in a large number of situations, the cloud is deployed on an homogeneous set of hosts. In these cases, this limitation can affect the performance of applications being executed in guest systems. In this paper, we have analyzed the architecture, the KVM setup, and the performance of the Virtual Machines deployed by three popular cloud management platforms: Eucalyptus, Apache Cloud Stack and Open Stack, employing a representative set of applications.","PeriodicalId":6664,"journal":{"name":"2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing","volume":"12 2","pages":"1225-1228"},"PeriodicalIF":0.0,"publicationDate":"2015-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72614819","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 7
Analyzing MPI-3.0 Process-Level Shared Memory: A Case Study with Stencil Computations MPI-3.0进程级共享内存分析:以模板计算为例
2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing Pub Date : 2015-05-04 DOI: 10.1109/CCGrid.2015.131
Xiaomin Zhu, Junchao Zhang, Kazutomo Yoshii, Shigang Li, Yunquan Zhang, P. Balaji
{"title":"Analyzing MPI-3.0 Process-Level Shared Memory: A Case Study with Stencil Computations","authors":"Xiaomin Zhu, Junchao Zhang, Kazutomo Yoshii, Shigang Li, Yunquan Zhang, P. Balaji","doi":"10.1109/CCGrid.2015.131","DOIUrl":"https://doi.org/10.1109/CCGrid.2015.131","url":null,"abstract":"The recently released MPI-3.0 standard introduced a process-level shared-memory interface which enables processes within the same node to have direct load/store access to each others' memory. Such an interface allows applications to declare data structures that are shared by multiple MPI processes on the node. In this paper, we study the capabilities and performance implications of using MPI-3.0 shared memory, in the context of a five-point stencil computation. Our analysis reveals that the use of MPI-3.0 shared memory has several unforeseen performance implications including disrupting certain compiler optimizations and incorrectly using suboptimal page sizes inside the OS. Based on this analysis, we propose several methodologies for working around these issues and improving communication performance by 40-85% compared to the current MPI-1.0 based approach.","PeriodicalId":6664,"journal":{"name":"2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing","volume":"15 1","pages":"1099-1106"},"PeriodicalIF":0.0,"publicationDate":"2015-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90791599","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 8
Parallel DC3 Algorithm for Suffix Array Construction on Many-Core Accelerators 多核加速器上后缀阵列构建的并行DC3算法
Gang Liao, Longfei Ma, Guangming Zang, L. Tang
{"title":"Parallel DC3 Algorithm for Suffix Array Construction on Many-Core Accelerators","authors":"Gang Liao, Longfei Ma, Guangming Zang, L. Tang","doi":"10.1109/CCGrid.2015.56","DOIUrl":"https://doi.org/10.1109/CCGrid.2015.56","url":null,"abstract":"In bioinformatics applications, suffix arrays are widely used to DNA sequence alignments in the initial exact match phase of heuristic algorithms. With the exponential growth and availability of data, using many-core accelerators, like GPUs, to optimize existing algorithms is very common. We present a new implementation of suffix array on GPU. As a result, suffix array construction on GPU achieves around 10x speedup on standard large data sets, which contain more than 100 million characters. The idea is simple, fast and scalable that can be easily scale to multi-core processors and even heterogeneous architectures.","PeriodicalId":6664,"journal":{"name":"2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing","volume":"1 1","pages":"1155-1158"},"PeriodicalIF":0.0,"publicationDate":"2015-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89703154","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 4
MVAPICH2 over OpenStack with SR-IOV: An Efficient Approach to Build HPC Clouds 基于SR-IOV的MVAPICH2 over OpenStack:构建高性能计算云的有效方法
2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing Pub Date : 2015-05-04 DOI: 10.1109/CCGrid.2015.166
Jie Zhang, Xiaoyi Lu, Mark Daniel Arnold, D. Panda
{"title":"MVAPICH2 over OpenStack with SR-IOV: An Efficient Approach to Build HPC Clouds","authors":"Jie Zhang, Xiaoyi Lu, Mark Daniel Arnold, D. Panda","doi":"10.1109/CCGrid.2015.166","DOIUrl":"https://doi.org/10.1109/CCGrid.2015.166","url":null,"abstract":"Cloud Computing with Virtualization offers attractive flexibility and elasticity to deliver resources by providing a platform for consolidating complex IT resources in a scalable manner. However, efficiently running HPC applications on Cloud Computing systems is still full of challenges. One of the biggest hurdles in building efficient HPC clouds is the unsatisfactory performance offered by underlying virtualized environments, more specifically, virtualized I/O devices. Recently, Single Root I/O Virtualization (SR-IOV) technology has been steadily gaining momentum for high-performance interconnects such as InfiniBand and 10GigE. Due to its near native performance for inter-node communication, many cloud systems such as Amazon EC2 have been using SR-IOV in their production environments. Nevertheless, recent studies have shown that the SR-IOV scheme lacks locality aware communication support, which leads to performance overheads for inter-VM communication within the same physical node. In this paper, we propose an efficient approach to build HPC clouds based on MVAPICH2 over Open Stack with SR-IOV. We first propose an extension for Open Stack Nova system to enable the IV Shmem channel in deployed virtual machines. We further present and discuss our high-performance design of virtual machine aware MVAPICH2 library over Open Stack-based HPC Clouds. Our design can fully take advantage of high-performance SR-IOV communication for inter-node communication as well as Inter-VM Shmem (IVShmem) for intra-node communication. A comprehensive performance evaluation with micro-benchmarks and HPC applications has been conducted on an experimental Open Stack-based HPC cloud and Amazon EC2. The evaluation results on the experimental HPC cloud show that our design and extension can deliver near bare-metal performance for implementing SR-IOV-based HPC clouds with virtualization. Further, compared with the performance on EC2, our experimental HPC cloud can exhibit up to 160X, 65X, 12X improvement potential in terms of point-to-point, collective and application for future HPC clouds.","PeriodicalId":6664,"journal":{"name":"2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing","volume":"31 1","pages":"71-80"},"PeriodicalIF":0.0,"publicationDate":"2015-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79333798","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 22
Eliminating the Redundancy in MapReduce-Based Entity Resolution 消除基于mapreduce的实体解析中的冗余
Cairong Yan, Yalong Song, Jian Wang, Wenjing Guo
{"title":"Eliminating the Redundancy in MapReduce-Based Entity Resolution","authors":"Cairong Yan, Yalong Song, Jian Wang, Wenjing Guo","doi":"10.1109/CCGrid.2015.24","DOIUrl":"https://doi.org/10.1109/CCGrid.2015.24","url":null,"abstract":"Entity resolution is the basic operation of data quality management, and the key step to find the value of data. The parallel data processing framework based on MapReduce can deal with the challenge brought by big data. However, there exist two important issues, avoiding redundant pairs led by the multi-pass blocking method and optimizing candidate pairs based on the transitive relations of similarity. In this paper, we propose a multi-signature based parallel entity resolution method, called multi-sig-er, which supports unstructured data and structured data. Two redundancy elimination strategies are adopted to prune the candidate pairs and reduce the number of similarity computation without affecting the resolution accuracy. Experimental results on real-world datasets show that our method tends to handle large datasets and it is more suitable for complex similarity computation than simple object matching.","PeriodicalId":6664,"journal":{"name":"2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing","volume":"6 1","pages":"1233-1236"},"PeriodicalIF":0.0,"publicationDate":"2015-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90429897","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 10
Toward Implementing Robust Support for Portals 4 Networks in MPICH 在MPICH中实现对门户网络的强大支持
Kenneth Raffenetti, Antonio J. Peña, P. Balaji
{"title":"Toward Implementing Robust Support for Portals 4 Networks in MPICH","authors":"Kenneth Raffenetti, Antonio J. Peña, P. Balaji","doi":"10.1109/CCGrid.2015.79","DOIUrl":"https://doi.org/10.1109/CCGrid.2015.79","url":null,"abstract":"The Portals 4 network specification is a low-levelAPI for high-performance networks developed by Sandia National Laboratories, Intel Corporation, and the University of NewMexico. Portals 4 is specifically designed to support both the MPIand PGAS programming models efficiently by providing building blocks upon which to implement their particular features. In this paper we discuss our ongoing efforts to add efficient and robust support for Portals 4 networks inside MPICH, and we describe how the API semantics influenced our design. In particular, we found the lack of reliability guarantees from the Portals4 layer challenging to address. To tackle this situation, we implemented an intermediate layer - Rportals (reliable Portals), which modularizes the reliability functionality within our Portals network module for MPICH. In this paper we present theRportals design and its performance impact.","PeriodicalId":6664,"journal":{"name":"2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing","volume":"76 1","pages":"1173-1176"},"PeriodicalIF":0.0,"publicationDate":"2015-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90587044","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 4
Assessing Memory Access Performance of Chapel through Synthetic Benchmarks 通过综合基准评估Chapel的内存访问性能
2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing Pub Date : 2015-05-04 DOI: 10.1109/CCGrid.2015.157
Engin Kayraklioglu, T. El-Ghazawi
{"title":"Assessing Memory Access Performance of Chapel through Synthetic Benchmarks","authors":"Engin Kayraklioglu, T. El-Ghazawi","doi":"10.1109/CCGrid.2015.157","DOIUrl":"https://doi.org/10.1109/CCGrid.2015.157","url":null,"abstract":"The Partitioned Global Address Space(PGAS) programming model strikes a balance between high performance and locality awareness. As a PGAS language, Chapel relieves programmers from handling details of data movement in a distributed memory environment, by presenting a flat memory space that is logically partitioned among executing entities. Traversing such a space requires address mapping to the system virtual address space, and as such, this abstraction inevitably causes major overheads during memory accesses. In this paper, we analyzed the extent of this overhead by implementing a micro benchmark to test different types of memory accesses that can be observed in Chapel. We showed that, as the locality gets exploited speedup gains up to 35x can be achieved. This was demonstrated through hand tuning, however. More productive means should be provided to deliver such performance improvement without excessively burdening programmers. Therefore, we also discuss possibilities to increase Chapel's performance through standard libraries, compiler, runtime and/or hardware support to handle different types of memory accesses more efficiently.","PeriodicalId":6664,"journal":{"name":"2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing","volume":"7 1","pages":"1147-1150"},"PeriodicalIF":0.0,"publicationDate":"2015-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78436529","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 3
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
相关产品
×
本文献相关产品
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信