2011 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing: Latest Papers (Page 3)

Network-Friendly One-Sided Communication through Multinode Cooperation on Petascale Cray XT5 Systems

2011 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing Pub Date : 2011-05-23 DOI: 10.1109/CCGrid.2011.62

Xinyu Que, Weikuan Yu, V. Tipparaju, J. Vetter, Bin Wang

Abstract: One-sided communication is important to enable asynchronous communication and data movement for Global Address Space (GAS) programming models. Such communication is typically realized through direct messages between initiator and target processes. For petascale systems with 10,000s of nodes and 100,000s of cores, these direct messages require dedicated communication buffers and/or channels, which can lead to significant scalability challenges for GAS programming models. In this paper, we describe a network-friendly communication model, multinode cooperation, to enable indirect one-sided communication. Compute nodes work together to handle one-sided requests through (1) request forwarding, in which one node can intercept a request and forward it to a target node, and (2) request aggregation, in which one node can aggregate many requests to a target node. We have implemented multinode cooperation for a popular GAS runtime library, Aggregate Remote Memory Copy Interface (ARMCI). Our experimental results on a large-scale Cray XT5 system demonstrate that multinode cooperation is able to greatly increase memory scalability by reducing the communication buffers required on each node. In addition, multinode cooperation improves the resiliency of the GAS runtime system to network contention. Furthermore, multinode cooperation can benefit the performance of scientific applications. In one case, it reduces the total execution time of an NWChem application by 52%.

Citations: 1
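
The request aggregation idea in the abstract above (one node batching many requests bound for the same target) can be illustrated with a minimal sketch. This is not ARMCI's API or the authors' implementation; the function name, the (target, payload) request format, and the sample requests are all hypothetical.

```python
from collections import defaultdict


def aggregate_requests(requests):
    """Group (target_node, payload) pairs so each target receives one batch.

    Hypothetical illustration only: a real runtime would pack the batch
    into a single network message instead of returning a dict.
    """
    batches = defaultdict(list)
    for target, payload in requests:
        batches[target].append(payload)
    return dict(batches)


# Four one-sided requests, three of them bound for node 3 (made-up data).
requests = [(3, "put A"), (7, "get B"), (3, "put C"), (3, "acc D")]
batches = aggregate_requests(requests)
```

With aggregation, the four requests collapse into two batches, one per target node, which is the sense in which fewer per-peer buffers and messages are needed.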

Towards Real-Time, Volunteer Distributed Computing

2011 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing Pub Date : 2011-05-23 DOI: 10.1109/CCGrid.2011.54

Sangho Yi, E. Jeannot, Derrick Kondo, David P. Anderson

Citations: 20

GPGPU-Accelerated Parallel and Fast Simulation of Thousand-Core Platforms

2011 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing Pub Date : 2011-05-23 DOI: 10.1109/CCGRID.2011.64

Christian Pinto, Shivani Raghav, A. Marongiu, M. Ruggiero, David Atienza Alonso, Luca Benini

Abstract: The multicore revolution and the ever-increasing complexity of computing systems are dramatically changing system design, analysis and programming of computing platforms. Future architectures will feature hundreds to thousands of simple processors and on-chip memories connected through a network-on-chip. Architectural simulators will remain primary tools for design space exploration, software development and performance evaluation of these massively parallel architectures. However, architectural simulation performance is a serious concern, as virtual platforms and simulation technology are not able to tackle the complexity of future scenarios with thousands of cores. The main contribution of this paper is the development of a new simulation approach and technology for many-core processors that exploits the enormous parallel processing capability of low-cost and widely available General Purpose Graphic Processing Units (GPGPU). The simulation of many-core architectures indeed exhibits a high level of parallelism and is inherently parallelizable, but GPGPU acceleration of architectural simulation requires an in-depth revision of the data structures and functional partitioning traditionally used in parallel simulation. We demonstrate our GPGPU simulator on a target architecture composed of several cores (i.e., ARM ISA based), with instruction and data caches, connected through a Network-on-Chip (NoC). Our experiments confirm the feasibility of our approach.

Citations: 21

Diagnosing Anomalous Network Performance with Confidence

2011 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing Pub Date : 2011-05-23 DOI: 10.1109/CCGrid.2011.80

B. Settlemyer, S. Hodson, J. Kuehn, S. Poole

Citations: 1

Unifying Cloud Management: Towards Overall Governance of Business Level Objectives

2011 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing Pub Date : 2011-05-23 DOI: 10.1109/CCGrid.2011.65

M. Sedaghat, F. Hernández-Rodriguez, E. Elmroth

Abstract: We address the challenge of providing unified cloud resource management towards an overall business level objective, given the multitude of managerial tasks to be performed and the complexity of any architecture to support them. Resource level management tasks include elasticity control, virtual machine and data placement, autonomous fault management, etc., which are intrinsically difficult problems since services normally have unknown lifetimes and capacity demands that vary widely over time. Unifying the management of these problems (for optimization with respect to some higher-level business objective, such as optimizing revenue while breaking no more than a certain percentage of service level agreements) becomes even more challenging, as the resource level managerial challenges are far from independent. After providing the general problem formulation, we review recent approaches taken by the research community, including mainly general autonomic computing technology for large-scale environments and resource level management tools equipped with some business-oriented or otherwise qualitative features. We propose and illustrate a policy-driven approach where a high-level management system monitors overall system and service behavior and adjusts lower level policies (e.g., thresholds for admission control, elasticity control, server consolidation level, etc.) for optimization towards the measurable business level objectives.

Citations: 44

Ex-MATE: Data Intensive Computing with Large Reduction Objects and Its Application to Graph Mining

2011 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing Pub Date : 2011-05-23 DOI: 10.1109/CCGrid.2011.18

Wei Jiang, G. Agrawal

Abstract: The map-reduce framework has been widely used as the infrastructure for processing large-scale datasets in various domains. Recent work has shown that an alternate API, MATE (Mapreduce with an Alternate API), where a reduction object is explicitly maintained and updated, reduces memory requirements and can significantly improve performance for many applications. However, unlike the original API, support for the alternate API has been restricted to cases where the reduction object can fit in memory. This limits the applicability of the MATE approach. In particular, one emerging class of applications that requires support for large reduction objects is graph mining. This paper describes a system, Extended MATE or Ex-MATE, which supports this alternate API with reduction objects of arbitrary sizes. We develop support for managing disk-resident reduction objects and updating them efficiently. We evaluate our system using three graph mining applications and compare its performance to that of PEGASUS, a graph mining system implemented based on the original map-reduce API and its Hadoop implementation. Our results on a cluster with 128 cores show that for all three applications, our system outperforms PEGASUS by factors ranging between 9 and 35.

Citations: 22
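
The "explicitly maintained and updated" reduction object mentioned in the abstract above can be sketched in a few lines. This is an in-memory toy under stated assumptions, not the MATE or Ex-MATE API: the function names, the choice of out-degree counting as the graph task, and the edge data are all hypothetical, and the disk-resident handling that Ex-MATE adds is not shown.

```python
def local_reduction(edges):
    """Process one data split, updating a reduction object in place.

    The reduction object here is a dict mapping vertex -> out-degree;
    no intermediate (key, value) pairs are emitted or shuffled.
    """
    robj = {}
    for src, _dst in edges:
        robj[src] = robj.get(src, 0) + 1
    return robj


def merge(a, b):
    """Combine two reduction objects produced by different splits."""
    out = dict(a)
    for vertex, count in b.items():
        out[vertex] = out.get(vertex, 0) + count
    return out


# Two splits of a small, made-up edge list.
splits = [[(1, 2), (1, 3), (2, 3)], [(3, 1), (1, 4)]]
result = {}
for split in splits:
    result = merge(result, local_reduction(split))
```

The point of the sketch is the contrast with the original map-reduce style: each split folds its input directly into one object, and only those objects are combined at the end.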

GeoServ: A Distributed Urban Sensing Platform

2011 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing Pub Date : 2011-05-23 DOI: 10.1109/CCGrid.2011.10

Jong Hoon Ahnn, Uichin Lee, H. J. Moon

Abstract: Urban sensing, where mobile users continuously gather, process, and share location-sensitive sensor data (e.g., street images, road condition, traffic flow), is emerging as a new network paradigm of sensor information sharing in urban environments. The key enablers are smartphones (e.g., iPhones and Android phones) equipped with onboard sensors (e.g., cameras, accelerometer, compass, GPS) and various wireless devices (e.g., WiFi and 2G/3G). The goal of this paper is to design a scalable sensor networking platform where millions of users on the move can participate in urban sensing and share location-aware information using always-on cellular data connections. We propose a two-tier sensor networking platform called GeoServ, where mobile users publish and access sensor data via an Internet-based distributed P2P overlay network. The main contribution of this paper is two-fold: a location-aware sensor data retrieval scheme that supports geographic range queries, and a location-aware publish-subscribe scheme that enables efficient multicast routing over a group of subscribed users. We prove that GeoServ protocols preserve locality and validate their performance via extensive simulations.

Citations: 30

Social Networks of Researchers and Educators on nanoHUB.org

2011 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing Pub Date : 2011-05-23 DOI: 10.1109/CCGRID.2011.33

Gerhard Klimeck, G. Adams, K. Madhavan, Nathan Denny, M. Zentner, Swaroop Shivarajapura, L. Zentner, D. Beaudoin

Abstract: The science gateway nanoHUB.org is the world's largest nanotechnology user facility, serving 167,196 users in 2010 with over 2,300 resources including 189 simulation programs. Surveys of nanoHUB users and automated usage analysis find widespread simulation use in formal classroom education, thereby connecting recent research more rapidly and closely to education. Analysis of 719 citations in the scientific literature by over 1,300 authors to nanoHUB.org resources documents use of simulation programs by new research collaborations, by researchers outside of the community originating the program, and by experimentalists. The publication and author networks reveal research collaborations and capacity building through knowledge transfer. Analysis of secondary citations documents the quality of the conducted research with an h-index of 30 after just 10 years of operation. Our analysis proves with quantitative metrics that impactful research can be conducted by an ever-growing research community. We argue that HUBzero™ technology and the user-focused design and operation of nanoHUB.org are keys to success that can be transferred to other science gateways.

Citations: 7
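
The h-index cited in the abstract above is the largest h such that h papers each have at least h citations. A minimal sketch of that standard definition follows; the citation counts in the example are made up for illustration and are not nanoHUB data.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


# Hypothetical citation counts for five papers.
example = h_index([10, 8, 5, 4, 3])
```

For the hypothetical list above, four papers have at least four citations each but there are not five with at least five, so the result is 4.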

MPI-IO/Gfarm: An Optimized Implementation of MPI-IO for the Gfarm File System

2011 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing Pub Date : 2011-05-23 DOI: 10.1109/CCGrid.2011.82

Hiroki Kimura, O. Tatebe

Citations: 2

Predictive Data Grouping and Placement for Cloud-Based Elastic Server Infrastructures

2011 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing Pub Date : 2011-05-23 DOI: 10.1109/CCGrid.2011.49

Juan M. Tirado, Daniel Higuero, Florin Isaila, J. Carretero

Abstract: Workload variations on Internet platforms such as YouTube, Flickr, and LastFM require novel approaches to dynamic resource provisioning in order to meet QoS requirements while reducing the Total Cost of Ownership (TCO) of the infrastructures. The economy-of-scale promise of cloud computing is a great opportunity to approach this problem by developing elastic large-scale server infrastructures. However, a proactive approach to dynamic resource provisioning requires prediction models forecasting future load patterns. On the other hand, unexpected volume and data spikes require reactive provisioning for serving unexpected surges in workloads. When workload cannot be predicted, adequate data grouping and placement algorithms may facilitate agile scaling up and down of an infrastructure. In this paper, we analyze a dynamic workload of an on-line music portal and present an elastic Web infrastructure that adapts to workload variations by dynamically scaling servers up and down. The workload is predicted by an autoregressive model capturing trends and seasonal patterns. Further, for enhancing data locality, we propose a predictive data grouping based on the history of content access of a user community. Finally, in order to facilitate agile elasticity, we present a data placement based on workload and access pattern prediction. The experimental results demonstrate that our forecasting model predicts workload with high precision. Further, the predictive data grouping and placement methods provide high locality, load balance and high utilization of resources, allowing a server infrastructure to scale up and down depending on workload.

Citations: 49
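
The autoregressive workload prediction mentioned in the abstract above can be illustrated with the simplest member of that family, an AR(1) model fitted by ordinary least squares. This is a deliberate simplification and an assumption on our part: the paper's model captures trends and seasonal patterns, which this sketch does not, and the function names and load values are hypothetical.

```python
def fit_ar1(series):
    """Fit y[t] = c + phi * y[t-1] by ordinary least squares."""
    x = series[:-1]
    y = series[1:]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var = sum((a - mean_x) ** 2 for a in x)
    phi = cov / var
    c = mean_y - phi * mean_x
    return c, phi


def forecast_next(series):
    """One-step-ahead forecast from the fitted AR(1) model."""
    c, phi = fit_ar1(series)
    return c + phi * series[-1]


# Made-up load samples that follow y[t] = 2 + 0.5 * y[t-1] exactly.
load = [10.0, 7.0, 5.5, 4.75, 4.375]
c, phi = fit_ar1(load)
prediction = forecast_next(load)
```

A provisioning loop could compare such a forecast against per-server capacity to decide whether to add or remove servers; that decision logic is outside this sketch.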