2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing: Latest Papers

Efficient Runtime Environment for Coupled Multi-physics Simulations: Dynamic Resource Allocation and Load-Balancing
S. Ko, Nayong Kim, Joohyun Kim, A. Thota, S. Jha
{"title":"Efficient Runtime Environment for Coupled Multi-physics Simulations: Dynamic Resource Allocation and Load-Balancing","authors":"S. Ko, Nayong Kim, Joohyun Kim, A. Thota, S. Jha","doi":"10.1109/CCGRID.2010.107","DOIUrl":"https://doi.org/10.1109/CCGRID.2010.107","url":null,"abstract":"Coupled Multi-Physics simulations, such as hybrid CFD-MD simulations, represent an increasingly important class of scientific applications. Often the physical problems of interest demand the use of high-end computers, such as TeraGrid resources, which are often accessible only via batch-queues. Batch-queue systems are not developed to natively support the coordinated scheduling of jobs – which in turn is required to support the concurrent execution required by coupled multi-physics simulations. In this paper we develop and demonstrate a novel approach to overcome the lack of native support for the coordinated job submission requirement associated with coupled runs. We establish the performance advantages arising from our solution, which is a generalization of the Pilot-Job concept – which in and of itself is not new, but is being applied to coupled simulations for the first time. Our solution not only overcomes the initial co-scheduling problem, but also provides a dynamic resource allocation mechanism. Support for such dynamic resources is critical for a load balancing mechanism, which we develop and demonstrate to be effective at reducing the total time-to-solution of the problem. We establish that the performance advantage of using Big Jobs is invariant with the size of the machine as well as the size of the physical model under investigation. The Pilot-Job abstraction is developed using SAGA, which provides an infrastructure agnostic implementation, and which can seamlessly execute and utilize distributed resources.","PeriodicalId":444485,"journal":{"name":"2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124977535","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 33
The Lightweight Approach to Use Grid Services with Grid Widgets on Grid WebOS
Yi-Lun Pan, Chang-Hsing Wu, Chia-Yen Liu, Hsi-En Yu, Weicheng Huang
{"title":"The Lightweight Approach to Use Grid Services with Grid Widgets on Grid WebOS","authors":"Yi-Lun Pan, Chang-Hsing Wu, Chia-Yen Liu, Hsi-En Yu, Weicheng Huang","doi":"10.1109/CCGRID.2010.25","DOIUrl":"https://doi.org/10.1109/CCGRID.2010.25","url":null,"abstract":"To bridge the gap between computing grid environment and users, various Grid Widgets are developed by the Grid development team in the National Center for High-performance Computing (NCHC). These widgets are implemented to provide users with seamless and scalable access to Grid resources. Currently, this effort integrates the de facto Grid middleware, Web-based Operating System (WebOS), and automatic resource allocation mechanism to form a virtual computer in distributed computing environment. With the capability of automatic resource allocation and the feature of dynamic load prediction, the Resource Broker (RB) improves the performance of the dynamic scheduling over conventional scheduling policies. With this extremely lightweight and flexible approach to acquire Grid services, the barrier for users to access geographically distributed heterogeneous Grid resources is largely reduced. The Grid Widgets can also be customized and configured to meet the demands of the users.","PeriodicalId":444485,"journal":{"name":"2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121904228","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Experiments with Memory-to-Memory Coupling for End-to-End Fusion Simulation Workflows
C. Docan, Fan Zhang, M. Parashar, J. Cummings, N. Podhorszki, S. Klasky
{"title":"Experiments with Memory-to-Memory Coupling for End-to-End Fusion Simulation Workflows","authors":"C. Docan, Fan Zhang, M. Parashar, J. Cummings, N. Podhorszki, S. Klasky","doi":"10.1109/CCGRID.2010.101","DOIUrl":"https://doi.org/10.1109/CCGRID.2010.101","url":null,"abstract":"Scientific applications are striving to accurately simulate multiple interacting physical processes that comprise complex phenomena being modeled. Efficient and scalable parallel implementations of these coupled simulations present challenging interaction and coordination requirements, especially when the coupled physical processes are computationally heterogeneous and progress at different speeds. In this paper, we present the design, implementation and evaluation of a memory-to-memory coupling framework for coupled scientific simulations on high-performance parallel computing platforms. The framework is driven by the coupling requirements of the Center for Plasma Edge Simulation, and it provides simple coupling abstractions as well as efficient asynchronous (RDMA-based) memory-to-memory data transport mechanisms that complement existing parallel programming systems and data sharing frameworks. The framework enables flexible coupling behaviors that are asynchronous in time and space, and it supports dynamic coupling between heterogeneous simulation processes without enforcing any synchronization constraints. We evaluate the performance and scalability of the coupling framework using a specific coupling scenario, on the Jaguar Cray XT5 system at Oak Ridge National Laboratory.","PeriodicalId":444485,"journal":{"name":"2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123916973","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 13
Elastic Site: Using Clouds to Elastically Extend Site Resources
Paul Marshall, K. Keahey, Timothy Freeman
{"title":"Elastic Site: Using Clouds to Elastically Extend Site Resources","authors":"Paul Marshall, K. Keahey, Timothy Freeman","doi":"10.1109/CCGRID.2010.80","DOIUrl":"https://doi.org/10.1109/CCGRID.2010.80","url":null,"abstract":"Infrastructure-as-a-Service (IaaS) cloud computing offers new possibilities to scientific communities. One of the most significant is the ability to elastically provision and relinquish new resources in response to changes in demand. In our work, we develop a model of an “elastic site” that efficiently adapts services provided within a site, such as batch schedulers, storage archives, or Web services to take advantage of elastically provisioned resources. We describe the system architecture along with the issues involved with elastic provisioning, such as security, privacy, and various logistical considerations. To avoid over- or under-provisioning the resources we propose three different policies to efficiently schedule resource deployment based on demand. We have implemented a resource manager, built on the Nimbus toolkit to dynamically and securely extend existing physical clusters into the cloud. Our elastic site manager interfaces directly with local resource managers, such as Torque. We have developed and evaluated policies for resource provisioning on a Nimbus-based cloud at the University of Chicago, another at Indiana University, and Amazon EC2. We demonstrate a dynamic and responsive elastic cluster, capable of responding effectively to a variety of job submission patterns. We also demonstrate that we can process 10 times faster by expanding our cluster up to 150 EC2 nodes.","PeriodicalId":444485,"journal":{"name":"2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing","volume":"343 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124313818","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 283
Performance Analysis of Diffusion Tensor Imaging in an Academic Production Grid
D. Krefting, R. Lützkendorf, Kathrin Peter, J. Bernarding
{"title":"Performance Analysis of Diffusion Tensor Imaging in an Academic Production Grid","authors":"D. Krefting, R. Lützkendorf, Kathrin Peter, J. Bernarding","doi":"10.1109/CCGRID.2010.21","DOIUrl":"https://doi.org/10.1109/CCGRID.2010.21","url":null,"abstract":"Analysis of diffusion weighted magnetic resonance images serves increasingly for non-invasive tracking of nerve fibers in the human brain, both in clinical diagnosis and basic research. Diffusion-tensor imaging (DTI) enables in-vivo research on the internal structure of the central nervous system, an estimation of the interconnection of functional areas and diagnosis of brain tumors and de-myelinating diseases. But modeling the local diffusion parameters is computationally expensive and on standard desktop computers runtimes of up to days are common. A workflow based grid implementation of the algorithm with slice-based parallelization has shown significant speedup. However, in production use, the implementation was frequently delayed and even failed, discouraging the medical collaborators from taking up the management of the data processing themselves. Therefore a comprehensive analysis of possible sources for errors and delays as well as their real impact in the respective infrastructure is vital to enable clinical researchers to fully exploit the benefits of the Healthgrid application. In this manuscript, we tested different implementations of the DTI analysis with respect to robustness and runtime. Based on the results, concrete application improvements as well as general suggestions for the layout and maintenance of Healthgrids are derived.","PeriodicalId":444485,"journal":{"name":"2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122733833","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 7
An Effective Architecture for Automated Appliance Management System Applying Ontology-Based Cloud Discovery
A. V. Dastjerdi, Sayed Gholam Hassan Tabatabaei, R. Buyya
{"title":"An Effective Architecture for Automated Appliance Management System Applying Ontology-Based Cloud Discovery","authors":"A. V. Dastjerdi, Sayed Gholam Hassan Tabatabaei, R. Buyya","doi":"10.1109/CCGRID.2010.87","DOIUrl":"https://doi.org/10.1109/CCGRID.2010.87","url":null,"abstract":"Cloud computing is a computing paradigm which allows on-demand access to computing elements and storage over the Internet. Virtual Appliances, which are pre-configured, ready-to-run applications, are emerging as a breakthrough technology to solve the complexities of service deployment on Cloud infrastructure. However, an automated approach to deploying required appliances on the most suitable Cloud infrastructure has been neglected by previous work and is the focus of this work. In this paper, we propose an effective architecture using ontology-based discovery to provide QoS aware deployment of appliances on Cloud service providers. In addition, we test our approach on a case study and the result shows the efficiency and effectiveness of the proposed work.","PeriodicalId":444485,"journal":{"name":"2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116813584","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 112
Selective Recovery from Failures in a Task Parallel Programming Model
James Dinan, Arjun Singri, P. Sadayappan, S. Krishnamoorthy
{"title":"Selective Recovery from Failures in a Task Parallel Programming Model","authors":"James Dinan, Arjun Singri, P. Sadayappan, S. Krishnamoorthy","doi":"10.1109/CCGRID.2010.34","DOIUrl":"https://doi.org/10.1109/CCGRID.2010.34","url":null,"abstract":"We present a fault tolerant task pool execution environment that is capable of performing fine-grain selective restart using a lightweight, distributed task completion tracking mechanism. Compared with conventional checkpoint/restart techniques, this system offers a recovery penalty that is proportional to the degree of failure rather than the system size. We evaluate this system using the Self Consistent Field (SCF) kernel which forms an important component in ab initio methods for computational chemistry. Experimental results indicate that fault tolerant task pools are robust in the presence of an arbitrary number of failures and that they offer low overhead in the absence of faults.","PeriodicalId":444485,"journal":{"name":"2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129029704","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 10
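The selective-restart idea in the abstract above can be illustrated with a toy sketch. It is not the authors' implementation: the function name `run_with_selective_restart`, the round-robin task assignment, and the single in-process "completion log" are all assumptions made for illustration, whereas the paper describes a distributed tracking mechanism. The point shown is only that the recovery work scales with the number of lost tasks, not with the total number of tasks or workers.

```python
def run_with_selective_restart(tasks, num_workers, failed_workers):
    """Toy model (hypothetical): run tasks, then redo only those lost to failed workers.

    tasks: dict mapping task id -> zero-argument callable.
    Returns (results, re_executed), where re_executed lists the task ids redone.
    """
    survivors = [w for w in range(num_workers) if w not in failed_workers]
    if not survivors:
        raise ValueError("at least one worker must survive")

    completed_by = {}  # completion log: task id -> worker that holds the result
    results = {}
    for k, task_id in enumerate(sorted(tasks)):
        worker = k % num_workers  # assumed round-robin placement
        results[task_id] = tasks[task_id]()
        completed_by[task_id] = worker

    # A failure loses only the results held by the failed workers.
    lost = [t for t, w in completed_by.items() if w in failed_workers]
    for t in lost:
        del results[t]

    # Selective restart: re-execute just the lost tasks on surviving workers.
    for k, t in enumerate(lost):
        results[t] = tasks[t]()
        completed_by[t] = survivors[k % len(survivors)]
    return results, lost


tasks = {i: (lambda i=i: i * i) for i in range(8)}
results, redone = run_with_selective_restart(tasks, num_workers=4, failed_workers={1})
print(redone)  # [1, 5]
```

With four workers and one failure, only two of the eight tasks are repeated; a full checkpoint/restart would roll back every worker.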
Dynamic Load-Balanced Multicast for Data-Intensive Applications on Clouds
Tatsuhiro Chiba, M. Burger, T. Kielmann, S. Matsuoka
{"title":"Dynamic Load-Balanced Multicast for Data-Intensive Applications on Clouds","authors":"Tatsuhiro Chiba, M. Burger, T. Kielmann, S. Matsuoka","doi":"10.1109/CCGRID.2010.63","DOIUrl":"https://doi.org/10.1109/CCGRID.2010.63","url":null,"abstract":"Data-intensive parallel applications on clouds need to deploy large data sets from the cloud's storage facility to all compute nodes as fast as possible. Many multicast algorithms have been proposed for clusters and grid environments. The most common approach is to construct one or more spanning trees based on the network topology and network monitoring data in order to maximize available bandwidth and avoid bottleneck links. However, delivering optimal performance becomes difficult once the available bandwidth changes dynamically. In this paper, we focus on Amazon EC2/S3 (the most commonly used cloud platform today) and propose two high performance multicast algorithms. These algorithms make it possible to efficiently transfer large amounts of data stored in Amazon S3 to multiple Amazon EC2 nodes. The three salient features of our algorithms are (1) to construct an overlay network on clouds without network topology information, (2) to optimize the total throughput dynamically, and (3) to increase the download throughput by letting nodes cooperate with each other. The two algorithms differ in the way nodes cooperate: the first 'non-steal' algorithm lets each node download an equal share of all data, while the second 'steal' algorithm uses work stealing to counter the effect of heterogeneous download bandwidth. As a result, all nodes can download files from S3 quickly, even when the network performance changes while the algorithm is running. We evaluate our algorithms on EC2/S3, and show that they are scalable and consistently achieve high throughput. Both algorithms perform much better than having each node download all data directly from S3.","PeriodicalId":444485,"journal":{"name":"2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129072012","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 31
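The difference between the 'non-steal' and 'steal' strategies described in the abstract above can be shown with a small timing model. This is a simplified sketch and not the paper's algorithm: the function names, the unit-size pieces, the fixed per-node bandwidths, and the shared pool standing in for peer-to-peer stealing are all assumptions, and the exchange of pieces between nodes is not modelled. It only shows why an equal static split is limited by the slowest node while dynamic assignment is not.

```python
import heapq


def non_steal_time(num_pieces, bandwidths):
    """Static split: every node fetches an equal share, so the slowest node decides."""
    n = len(bandwidths)
    base, extra = divmod(num_pieces, n)
    shares = [base + (1 if i < extra else 0) for i in range(n)]
    return max(share / bw for share, bw in zip(shares, bandwidths))


def steal_time(num_pieces, bandwidths):
    """Dynamic split: whichever node becomes idle first takes the next piece.

    Stealing is approximated by a shared pool of remaining pieces (an assumption).
    """
    # Heap entries are (time at which the node is free, node index).
    free_at = [(0.0, i) for i in range(len(bandwidths))]
    heapq.heapify(free_at)
    finish = 0.0
    for _ in range(num_pieces):
        t, i = heapq.heappop(free_at)
        done = t + 1.0 / bandwidths[i]  # one unit-size piece at node i's bandwidth
        finish = max(finish, done)
        heapq.heappush(free_at, (done, i))
    return finish


bandwidths = [4.0, 1.0]  # hypothetical pieces-per-second for a fast and a slow node
print(non_steal_time(8, bandwidths))  # 4.0
print(steal_time(8, bandwidths))      # 2.0
```

In this example the fast node ends up fetching six of the eight pieces, which halves the time to have all pieces downloaded; with equal bandwidths the two strategies give the same result.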
Cluster Computing as an Assembly Process: Coordination with S-Net
C. Grelck, Jukka Julku, F. Penczek, A. Shafarenko
{"title":"Cluster Computing as an Assembly Process: Coordination with S-Net","authors":"C. Grelck, Jukka Julku, F. Penczek, A. Shafarenko","doi":"10.1109/CCGRID.2010.103","DOIUrl":"https://doi.org/10.1109/CCGRID.2010.103","url":null,"abstract":"This poster will present a coordination language for distributed computing and will discuss its application to cluster computing. It will introduce a programming technique of cluster computing whereby application components are completely dissociated from the communication/coordination infrastructure (unlike MPI-style message passing), and there is no shared memory either, whether virtual or physical (unlike OpenMP). Cluster computing is thus presented as something that happens as late as the assembly stage: components are integrated into an application using a new form of network glue: Single-Input, Single-Output (SISO) asynchronous, non-deterministic coordination.","PeriodicalId":444485,"journal":{"name":"2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121800839","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Multi-FFT Vectorization for the Cell Multicore Processor
J. Barhen, T. Humble, P. Mitra, M. Traweek
{"title":"Multi-FFT Vectorization for the Cell Multicore Processor","authors":"J. Barhen, T. Humble, P. Mitra, M. Traweek","doi":"10.1109/CCGRID.2010.78","DOIUrl":"https://doi.org/10.1109/CCGRID.2010.78","url":null,"abstract":"The emergence of streaming multicore processors with multi-SIMD architectures and ultra-low power operation combined with real-time compute and I/O reconfigurability opens unprecedented opportunities for executing sophisticated signal processing algorithms faster and within a much lower energy budget. Here, we present an unconventional FFT implementation scheme for the IBM Cell, named transverse vectorization. It is shown to outperform (both in terms of timing and GFLOP throughput) the fastest FFT results reported to date in the open literature.","PeriodicalId":444485,"journal":{"name":"2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122303248","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3