2011 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing: Latest Papers

Small Discrete Fourier Transforms on GPUs
S. Mitra, A. Srinivasan
{"title":"Small Discrete Fourier Transforms on GPUs","authors":"S. Mitra, A. Srinivasan","doi":"10.1109/CCGrid.2011.14","DOIUrl":"https://doi.org/10.1109/CCGrid.2011.14","url":null,"abstract":"Efficient implementations of the Discrete Fourier Transform (DFT) for GPUs provide good performance with large data sizes, but are not competitive with CPU code for small data sizes. On the other hand, several applications perform multiple DFTs on small data sizes. In fact, even algorithms for large data sizes use a divide-and-conquer approach, where eventually small DFTs need to be performed. We discuss our DFT implementation, which is efficient for multiple small DFTs. One feature of our implementation is the use of the asymptotically slow matrix multiplication approach for small data sizes, which improves performance on the GPU due to its regular memory access and computational patterns. We combine this algorithm with the mixed radix algorithm for 1-D, 2-D, and 3-D complex DFTs. We also demonstrate the effect of different optimization techniques. When GPUs are used to accelerate a component of an application running on the host, it is important that decisions taken to optimize the GPU performance not affect the performance of the rest of the application on the host. One feature of our implementation is that we use a data layout that is not optimal for the GPU so that the overall effect on the application is better. Our implementation performs up to two orders of magnitude faster than cuFFT on an NVIDIA GeForce 9800 GTX GPU and up to one to two orders of magnitude faster than FFTW on a CPU for multiple small DFTs. Furthermore, we show that our implementation can accelerate the performance of a Quantum Monte Carlo application for which cuFFT is not effective. 
The primary contributions of this work lie in demonstrating the utility of the matrix multiplication approach and also in providing an implementation that is efficient for small DFTs when a GPU is used to accelerate an application running on the host.","PeriodicalId":376385,"journal":{"name":"2011 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing","volume":"14 8 Pt 1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123722014","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9
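
The abstract above credits part of the speedup to computing small DFTs as a dense matrix product instead of with an FFT. A minimal CPU-side sketch of that idea in plain Python follows. It is not the authors' GPU implementation, and the function names `dft_matrix` and `batched_dft` are hypothetical. For a length-N input x, the transform is X = W x with W[j][k] = exp(-2πi·jk/N); this costs O(N²) per transform, but the access pattern is fully regular and the same W is reused for every signal in a batch.

```python
import cmath


def dft_matrix(n):
    """Return the n x n DFT matrix W with W[j][k] = exp(-2*pi*i*j*k/n)."""
    return [
        [cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n)]
        for j in range(n)
    ]


def batched_dft(signals):
    """Apply the same small DFT to every equal-length signal in a batch.

    Each output is a plain matrix-vector product, O(n^2) per signal.
    """
    if not signals:
        return []
    n = len(signals[0])
    if any(len(x) != n for x in signals):
        raise ValueError("all signals in a batch must have the same length")
    w = dft_matrix(n)  # built once, reused for the whole batch
    return [
        [sum(w[j][k] * x[k] for k in range(n)) for j in range(n)]
        for x in signals
    ]


if __name__ == "__main__":
    # Two length-4 signals: a unit impulse and a constant signal.
    for spectrum in batched_dft([[1, 0, 0, 0], [1, 1, 1, 1]]):
        print(spectrum)
```

On a GPU the batch would typically be laid out so that the matrix product maps onto many threads at once; the sketch only illustrates the arithmetic, not that mapping.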
EZTrace: A Generic Framework for Performance Analysis
François Trahay, François Rué, Mathieu Faverge, Y. Ishikawa, R. Namyst, J. Dongarra
{"title":"EZTrace: A Generic Framework for Performance Analysis","authors":"François Trahay, François Rué, Mathieu Faverge, Y. Ishikawa, R. Namyst, J. Dongarra","doi":"10.1109/CCGrid.2011.83","DOIUrl":"https://doi.org/10.1109/CCGrid.2011.83","url":null,"abstract":"Modern supercomputers with multi-core nodes enhanced by accelerators, as well as hybrid programming models introduce more complexity in modern applications. Exploiting efficiently all the resources requires a complex analysis of the performance of applications in order to detect time-consuming sections. We present eztrace, a generic trace generation framework that aims at providing a simple way to analyze applications. eztrace is based on plugins that allow it to trace different programming models such as MPI, pthread or OpenMP as well as user-defined libraries or applications. eztrace uses two steps: one to collect the basic information during execution and one post-mortem analysis. This permits tracing the execution of applications with low overhead while allowing to refine the analysis after the execution. We also present a script language for eztrace that gives the user the opportunity to easily define the functions to instrument without modifying the source code of the application.","PeriodicalId":376385,"journal":{"name":"2011 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125285354","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 48
Engineering Incentives in Social Clouds
Christian Haas, Simon Caton, Christof Weinhardt
{"title":"Engineering Incentives in Social Clouds","authors":"Christian Haas, Simon Caton, Christof Weinhardt","doi":"10.1109/CCGrid.2011.52","DOIUrl":"https://doi.org/10.1109/CCGrid.2011.52","url":null,"abstract":"Combining the strengths of the Cloud Computing and Social Network paradigms, the vision of Social Clouds aims to provide a resource sharing mechanism where participants dynamically share and trade resources on the premise of the relationships encoded in a social network. By building upon existing relationships in social networks and the inherent trust that accompanies these relationships, a Social Cloud is able to address one of the most cited obstacles in the adoption of current cloud solutions, the missing trust between Cloud service providers and users. However, as with other computing approaches relying on user participation, incentivisation of potential and existing participants is crucial for the success and sustainability of a Social Cloud. Therefore, an incentive engineering approach is needed and will be discussed in this paper considering all phases of user participation in a Social Cloud in order to provide proper incentives for active user participation and desired user behavior.","PeriodicalId":376385,"journal":{"name":"2011 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126702173","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 17
Building an Online Domain-Specific Computing Service over Non-dedicated Grid and Cloud Resources: The Superlink-Online Experience
M. Silberstein
{"title":"Building an Online Domain-Specific Computing Service over Non-dedicated Grid and Cloud Resources: The Superlink-Online Experience","authors":"M. Silberstein","doi":"10.1109/CCGrid.2011.46","DOIUrl":"https://doi.org/10.1109/CCGrid.2011.46","url":null,"abstract":"Linkage analysis is a statistical method used by geneticists in everyday practice for mapping disease-susceptibility genes in the study of complex diseases. An essential first step in the study of genetic diseases, linkage computations may require years of CPU time. The recent DNA sampling revolution enabled unprecedented sampling density, but made the analysis even more computationally demanding. In this paper we describe a high performance online service for genetic linkage analysis, called Superlink-online. The system enables anyone with Internet access to submit genetic data and analyze it as easily and quickly as if using a supercomputer. The analyses are automatically parallelized and executed on tens of thousands of distributed CPUs in multiple clouds and grids. The first version of the system, which employed up to 3,000 CPUs in UW Madison and Technion Condor pools, has been successfully used since 2006 by hundreds of geneticists worldwide, with over 40 citations in the genetics literature. Here we describe the second version, which substantially improves the scalability and performance of the first: it uses over 45,000 non-dedicated hosts, in 10 different grids and clouds, including EC2 and the Superlink@Technion community grid. Improved system performance is obtained through a virtual grid hierarchy with dynamic load balancing and multi-grid overlay via the GridBot system, parallel pruning of short tasks for overhead minimization, and cost-efficient use of cloud resources in reliability-critical execution periods. These enhancements enabled execution of many previously infeasible analyses, which can now be completed within a few hours. 
The new version of the system, in production since 2009, has completed over 6500 different runs of over 10 million tasks, with total consumption of 420 CPU years.","PeriodicalId":376385,"journal":{"name":"2011 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127011544","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 16
PAC-PLRU: A Cache Replacement Policy to Salvage Discarded Predictions from Hardware Prefetchers
Ke Zhang, Zhensong Wang, Yong Chen, Huaiyu Zhu, Xian-He Sun
{"title":"PAC-PLRU: A Cache Replacement Policy to Salvage Discarded Predictions from Hardware Prefetchers","authors":"Ke Zhang, Zhensong Wang, Yong Chen, Huaiyu Zhu, Xian-He Sun","doi":"10.1109/CCGrid.2011.27","DOIUrl":"https://doi.org/10.1109/CCGrid.2011.27","url":null,"abstract":"Cache replacement policy plays an important role in guaranteeing the availability of cache blocks, reducing miss rates, and improving applications' overall performance. However, recent research efforts on improving replacement policies require either significant additional hardware or major modifications to the organization of the existing cache. In this study, we propose the PAC-PLRU cache replacement policy. PAC-PLRU not only utilizes but also judiciously salvages the prediction information discarded from a widely-adopted stride prefetcher. The main idea behind PAC-PLRU is utilizing the prediction results generated by the existing stride prefetcher and preventing these predicted cache blocks from being replaced in the near future. Experimental results show that leveraging the PAC-PLRU with a stride prefetcher reduces the average L2 cache miss rate by 91% over a baseline system with only PLRU policy, and by 22% over a system using PLRU with an unconnected stride prefetcher. Most importantly, PAC-PLRU only requires minor modifications to existing cache architecture to get these benefits. 
The proposed PAC-PLRU policy is promising in fostering the connection between prefetching and replacement policies, and have a lasting impact on improving the overall cache performance.","PeriodicalId":376385,"journal":{"name":"2011 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing","volume":"90 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127182786","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9
DHTbd: A Reliable Block-Based Storage System for High Performance Clusters
G. Parisis, G. Xylomenos, T. Apostolopoulos
{"title":"DHTbd: A Reliable Block-Based Storage System for High Performance Clusters","authors":"G. Parisis, G. Xylomenos, T. Apostolopoulos","doi":"10.1109/CCGrid.2011.59","DOIUrl":"https://doi.org/10.1109/CCGrid.2011.59","url":null,"abstract":"Large, reliable and efficient storage systems are becoming increasingly important in enterprise environments. Our research in storage system design is oriented towards the exploitation of commodity hardware for building a high performance, resilient and scalable storage system. We present the design and implementation of DHTbd, a general purpose decentralized storage system where storage nodes support a distributed hash table based interface and clients are implemented as in-kernel device drivers. DHTbd, unlike most storage systems proposed to date, is implemented at the block device level of the I/O stack, a simple yet efficient design. The experimental evaluation of the proposed system demonstrates its very good I/O performance, its ability to scale to large clusters, as well as its robustness, even when massive failures occur.","PeriodicalId":376385,"journal":{"name":"2011 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing","volume":"79 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116996108","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
On the Performance Variability of Production Cloud Services
A. Iosup, N. Yigitbasi, D. Epema
{"title":"On the Performance Variability of Production Cloud Services","authors":"A. Iosup, N. Yigitbasi, D. Epema","doi":"10.1109/CCGRID.2011.22","DOIUrl":"https://doi.org/10.1109/CCGRID.2011.22","url":null,"abstract":"Cloud computing is an emerging infrastructure paradigm that promises to eliminate the need for companies to maintain expensive computing hardware. Through the use of virtualization and resource time-sharing, clouds address with a single set of physical resources a large user base with diverse needs. Thus, clouds have the potential to provide their owners the benefits of an economy of scale and, at the same time, become an alternative for both the industry and the scientific community to self-owned clusters, grids, and parallel production environments. For this potential to become reality, the first generation of commercial clouds need to be proven to be dependable. In this work we analyze the dependability of cloud services. Towards this end, we analyze long-term performance traces from Amazon Web Services and Google App Engine, currently two of the largest commercial clouds in production. We find that the performance of about half of the cloud services we investigate exhibits yearly and daily patterns, but also that most services have periods of especially stable performance. Last, through trace-based simulation we assess the impact of the variability observed for the studied cloud services on three large-scale applications, job execution in scientific computing, virtual goods trading in social networks, and state management in social gaming. 
We show that the impact of performance variability depends on the application, and give evidence that performance variability can be an important factor in cloud provider selection.","PeriodicalId":376385,"journal":{"name":"2011 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129834090","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 365
A Flexible Policy Framework for the QoS Differentiated Provisioning of Services
Mohan Baruwal Chhetri, Quoc Bao Vo, R. Kowalczyk
{"title":"A Flexible Policy Framework for the QoS Differentiated Provisioning of Services","authors":"Mohan Baruwal Chhetri, Quoc Bao Vo, R. Kowalczyk","doi":"10.1109/CCGrid.2011.36","DOIUrl":"https://doi.org/10.1109/CCGrid.2011.36","url":null,"abstract":"We propose a policy-based framework for the QoS differentiated provisioning of services. The proposed framework improves the state-of-the-art in policy-based preference specification by combining cardinal and ordinal preferences. We describe the underlying models, focussing on the key features and contributions of the proposed framework. We also show how, using our framework, the QoS evaluation problem can be translated to a Constraint Satisfaction Problem while preserving the semantics of the preference policies.","PeriodicalId":376385,"journal":{"name":"2011 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121263098","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 7
Optimized Management of Power and Performance for Virtualized Heterogeneous Server Clusters
V. Petrucci, E. V. Carrera, O. Loques, J. Leite, D. Mossé
{"title":"Optimized Management of Power and Performance for Virtualized Heterogeneous Server Clusters","authors":"V. Petrucci, E. V. Carrera, O. Loques, J. Leite, D. Mossé","doi":"10.1109/CCGrid.2011.15","DOIUrl":"https://doi.org/10.1109/CCGrid.2011.15","url":null,"abstract":"This paper proposes and evaluates an approach for power and performance management in virtualized server clusters. The major goal of our approach is to reduce power consumption in the cluster while meeting performance requirements. The contributions of this paper are: (1) a simple but effective way of modeling power consumption and capacity of servers even under heterogeneous and changing workloads, and (2) an optimization strategy based on a mixed integer programming model for achieving improvements on power-efficiency while providing performance guarantees in the virtualized cluster. In the optimization model, we address application workload balancing and the often ignored switching costs due to frequent and undesirable turning servers on/off and VM relocations. We show the effectiveness of the approach applied to a server cluster test bed. Our experiments show that our approach conserves about 50% of the energy required by a system designed for peak workload scenario, with little impact on the applications' performance goals. 
Also, by using prediction in our optimization strategy, further QoS improvement was achieved.","PeriodicalId":376385,"journal":{"name":"2011 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2011-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127106763","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 93
Efficient Support for MPI-I/O Atomicity Based on Versioning
Viet-Trung Tran, Bogdan Nicolae, Gabriel Antoniu, L. Bougé
{"title":"Efficient Support for MPI-I/O Atomicity Based on Versioning","authors":"Viet-Trung Tran, Bogdan Nicolae, Gabriel Antoniu, L. Bougé","doi":"10.1109/CCGrid.2011.60","DOIUrl":"https://doi.org/10.1109/CCGrid.2011.60","url":null,"abstract":"We consider the challenge of building data management systems that meet an important requirement of today's data-intensive HPC applications: to provide a high I/O throughput while supporting highly concurrent data accesses. In this context, many applications rely on MPI-I/O and require atomic, non-contiguous I/O operations that concurrently access shared data. In most existing implementations, the atomicity requirement is implemented through locking-based schemes, which have proven inefficient, especially for non-contiguous I/O. We claim that using a versioning-enabled storage back-end has the potential to avoid the expensive synchronization induced by locking-based schemes. We describe a prototype implementation on top of ROMIO, and report on promising experimental results with standard MPI-I/O benchmarks specifically designed to evaluate the performance of non-contiguous, overlapped I/O accesses under MPI atomicity guarantees.","PeriodicalId":376385,"journal":{"name":"2011 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114745511","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6