2015 IEEE International Conference on Cluster Computing: Latest Publications

A Workload-Aware Energy Model for Virtual Machine Migration
2015 IEEE International Conference on Cluster Computing Pub Date: 2015-09-08 DOI: 10.1109/CLUSTER.2015.47
Vincenzo De Maio, G. Kecskeméti, R. Prodan
Abstract: Energy consumption has become a significant issue for data centres. Assessing their consumption requires precise and detailed models. In recent years, many models have been proposed, but most of them either do not consider energy consumption related to virtual machine migration or do not consider the variation of the workload on (1) the virtual machines (VMs) and (2) the physical machines hosting the VMs. In this paper, we show that omitting migration and workload variation from the models could lead to misleading consumption estimates. Then, we propose a new model for data centre energy consumption that takes into account the previously omitted model parameters and provides accurate energy consumption predictions for paravirtualised virtual machines running on homogeneous hosts. The new model's accuracy is evaluated with a comprehensive set of operational scenarios. Using these scenarios, we present a comparative analysis of our model with similar state-of-the-art models for the energy consumption of VM migration, showing an improvement of up to 24% in prediction accuracy.
Citations: 21
Development of MapReduce and MPI Programs for Motif Search
2015 IEEE International Conference on Cluster Computing Pub Date: 2015-09-08 DOI: 10.1109/CLUSTER.2015.82
Mejdl S. Safran, Saad Al-qahtani, Michelle Zhu, D. Che
Abstract: As one of the important problems in molecular biology, motif search is computationally expensive, especially when the size of the DNA sequences is large. Extended from a graduate course project in parallel and distributed computing (PDC), this paper investigates two different programming frameworks, namely MapReduce and MPI, for motif finding. We implemented a serial algorithm, a MapReduce-based algorithm, and an MPI program to calculate the best motif in given DNA sequences. The experimental results demonstrate that our MPI program outperformed both the MapReduce-based algorithm and the serial program with superior efficiency.
Citations: 0
I/O-Aware Batch Scheduling for Petascale Computing Systems
2015 IEEE International Conference on Cluster Computing Pub Date: 2015-09-08 DOI: 10.1109/CLUSTER.2015.45
Zhou Zhou, Xu Yang, Dongfang Zhao, P. Rich, Wei Tang, Jia Wang, Z. Lan
Abstract: In the Big Data era, the gap between storage performance and an application's I/O requirements is increasing. I/O congestion caused by concurrent storage accesses from multiple applications is inevitable and severely harms performance. Conventional approaches either focus on optimizing an application's access pattern individually or handle I/O requests on a low-level storage layer without any knowledge from the upper-level applications. In this paper, we present a novel I/O-aware batch scheduling framework to coordinate ongoing I/O requests on petascale computing systems. The motivation behind this innovation is that the batch scheduler has a holistic view of both the system state and jobs' activities and can control the jobs' status on the fly during their execution. We treat a job's I/O requests as periodical subjobs within its lifecycle and transform the I/O congestion issue into a classical scheduling problem. We design two scheduling policies with different scheduling objectives, targeting either user-oriented metrics or system performance. We conduct extensive trace-based simulations using real job traces and I/O traces from a production IBM Blue Gene/Q system. Experimental results demonstrate that our design can improve job performance by more than 30%, as well as increasing system performance.
Citations: 43
Building Bridges from the Campus to XSEDE
2015 IEEE International Conference on Cluster Computing Pub Date: 2015-09-08 DOI: 10.1109/CLUSTER.2015.144
Liming Lee, Ian T Foster, S. Tuecke
Abstract: XSEDE is the integration framework for national-scale, public HPC resources in the United States. XSEDE is used by thousands of researchers at hundreds of college and university campuses throughout the country, as well as many international collaborators. Over the past several program years, XSEDE has redefined its identity management, security, and service interfaces to bridge the gap between national-scale HPC resources and campus-based computing resources. These changes make it easier for research performed on campus to access our national computing resources and make them a part of the everyday research process. We report here on XSEDE's new identity management system and how it provides a smooth bridge between campus and national identity systems. We also describe how this federated security system supports two additional bridges between campuses and national HPC services, one involving data movement and another involving scientific workflows.
Citations: 1
monBench: A Database Performance Benchmark for Cloud Monitoring System
2015 IEEE International Conference on Cluster Computing Pub Date: 2015-09-08 DOI: 10.1109/CLUSTER.2015.94
Xinkui Zhao, Jianwei Yin, Chen Zhi, Pengxiang Lin, Shichun Feng, Hao Wu, Zuoning Chen
Abstract: A monitoring system provides clear insight into the state and performance of service components in cloud computing platforms. It collects metrics from dispersed sensors and stores them in databases for display and future queries. Choosing the most suitable database for a monitoring system is complex, since the performance requirements of cloud monitoring systems differ widely across data centers. In this paper, we propose a benchmark named monBench to evaluate the performance of databases in cloud monitoring systems. monBench extracts structural data from real-world monitoring logs to construct the benchmarking workload. Several performance metrics, such as throughput and query time, are evaluated by monBench.
Citations: 1
Detecting and Correcting Data Corruption in Stencil Applications through Multivariate Interpolation
2015 IEEE International Conference on Cluster Computing Pub Date: 2015-09-08 DOI: 10.1109/CLUSTER.2015.108
L. Bautista-Gomez, F. Cappello
Abstract: High-performance computing is a powerful tool that allows scientists to study complex natural phenomena. Extreme-scale supercomputers promise orders of magnitude higher performance compared with that of current systems. However, power constraints in future exascale systems might limit the level of resilience of those machines. In particular, data could get corrupted silently, that is, without the hardware detecting the corruption. This situation is clearly unacceptable: simulation results must be within the error margin specified by the user. In this paper, we exploit multivariate interpolation in order to detect and correct data corruption in stencil applications. We evaluate this technique with a turbulent fluid application, and we demonstrate that the prediction error using multivariate interpolation is on the order of 0.01. Our results show that this mechanism can detect and correct most important corruptions and keep the error deviation under 1% during the entire execution while injecting one corruption per minute. In addition, we stress test the detector by injecting more than ten corruptions per minute and observe that our strategy allows the application to produce results with an error deviation under 10% in such a stressful scenario.
Citations: 43
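As a rough illustration of the idea summarized in the entry above (predicting a grid point from its spatial neighbours and treating a large mismatch as silent corruption), the following is a minimal sketch. It is not the authors' method: it assumes a 2D grid, a plain four-neighbour average as the interpolant, and a fixed threshold; the function names `detect_and_correct` and `make_field` are hypothetical.

```python
def detect_and_correct(grid, threshold):
    """Flag interior points that differ from the mean of their four
    neighbours by more than `threshold`, and replace each flagged
    point with that mean (assumed interpolant, for illustration only)."""
    rows, cols = len(grid), len(grid[0])
    corrected = [row[:] for row in grid]  # leave the input untouched
    flagged = []
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            # Prediction from the four direct neighbours in the original grid.
            predicted = (grid[i - 1][j] + grid[i + 1][j]
                         + grid[i][j - 1] + grid[i][j + 1]) / 4.0
            if abs(grid[i][j] - predicted) > threshold:
                flagged.append((i, j))
                corrected[i][j] = predicted
    return flagged, corrected


def make_field(n):
    """Smooth synthetic field f(i, j) = i + 2j (hypothetical test data)."""
    return [[float(i + 2 * j) for j in range(n)] for i in range(n)]


field = make_field(5)
field[2][2] += 100.0  # inject one silent corruption
flagged, repaired = detect_and_correct(field, threshold=50.0)
print(flagged, repaired[2][2])
```

The threshold has to sit between the error seen at the corrupted point and the smaller error that the corruption induces in the predictions of its neighbours; a real detector would presumably derive that bound from the application's data rather than fix it by hand.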
Analysis of XDMoD/SUPReMM Data Using Machine Learning Techniques
2015 IEEE International Conference on Cluster Computing Pub Date: 2015-09-08 DOI: 10.1109/CLUSTER.2015.114
S. Gallo, Joseph P. White, R. L. Deleon, T. Furlani, Helen Ngo, A. Patra, Matthew D. Jones, Jeffrey T. Palmer, N. Simakov, Jeanette M. Sperhac, Martins D. Innus, Thomas Yearke, Ryan Rathsam
Abstract: Machine learning techniques were applied to job accounting and performance data for application classification. Job data were accumulated using the XDMoD monitoring technology named SUPReMM; they consist of job accounting information, application information from Lariat/XALT, and job performance data from TACC_Stats. The results clearly demonstrate that community applications have characteristic signatures which can be exploited for job classification. We conclude that machine learning can assist in classifying jobs of unknown application, in characterizing the job mixture, and in harnessing the variation in node and time dependence for further analysis.
Citations: 11
Taming Non-local Stragglers Using Efficient Prefetching in MapReduce
2015 IEEE International Conference on Cluster Computing Pub Date: 2015-09-08 DOI: 10.1109/CLUSTER.2015.16
Ze Yu, Min Li, Xin Yang, Han Zhao, Xiaolin Li
Abstract: MapReduce has been widely adopted as a programming model to process big data. However, parallel jobs in MapReduce are prone to be plagued by stragglers caused by non-local tasks for two reasons: first, system logs from production clusters show that a non-local task can be two times slower than a local task; second, a job's completion time is bottlenecked by its slowest parallel tasks. As a result, even a single non-local task can become the straggler of the whole job, causing significant delay of the whole job. In this paper, we propose to alleviate this problem by proactively prefetching input data for non-local tasks. However, performing such prefetching efficiently in MapReduce is difficult, because it requires both application-level information to generate accurate prefetching requests at runtime and an appropriate network flow scheduling mechanism to guarantee the timeliness of prefetching flows. To address these challenges, we design and implement FlexFetch, which 1) leverages a novel mechanism called speculative scheduling to accurately generate prefetching flows, and 2) explicitly allocates network resources to prefetching flows using a criticality-aware deadline-driven flow scheduling algorithm. We evaluate FlexFetch through both testbed experiments and large-scale simulations using production workloads. The results show that FlexFetch reduces the completion time by 41.8% for small jobs and 26.8% on average, compared with the default MapReduce implementation in Hadoop.
Citations: 7
Modeling a Large Data-Acquisition Network in a Simulation Framework
2015 IEEE International Conference on Cluster Computing Pub Date: 2015-09-08 DOI: 10.1109/CLUSTER.2015.137
T. Colombo, H. Fröning, P. García, W. Vandelli
Abstract: The ATLAS detector at CERN records particle collision "events" delivered by the Large Hadron Collider. Its data-acquisition system identifies, selects, and stores interesting events in near real-time, with an aggregate throughput of several tens of GB/s. It is a distributed software system executed on a farm of roughly 2000 commodity worker nodes communicating via TCP/IP on an Ethernet network. Event data fragments are received from the many detector readout channels and are buffered, collected together, analyzed and either stored permanently or discarded. This system, and data-acquisition systems in general, are sensitive to the latency of the data transfer from the readout buffers to the worker nodes. Challenges affecting this transfer include the many-to-one communication pattern and the inherently bursty nature of the traffic. In this paper we introduce the main performance issues brought about by this workload, focusing in particular on the so-called TCP incast pathology. Since performing systematic studies of these issues is often impeded by operational constraints related to the mission-critical nature of these systems, we focus instead on the development of a simulation model of the ATLAS data-acquisition system, used as a case study. The simulation is based on the well-established OMNeT++ framework. Its results are compared with existing measurements of the system's behavior. The successful reproduction of the measurements by the simulations validates the modeling approach. We share some of the preliminary findings obtained from the simulation, as an example of the additional possibilities it enables, and outline the planned future investigations.
Citations: 4
Fast Fault Injection and Sensitivity Analysis for Collective Communications
2015 IEEE International Conference on Cluster Computing Pub Date: 2015-09-08 DOI: 10.1109/CLUSTER.2015.31
Kun Feng, Manjunath Gorentla Venkata, Dong Li, Xian-He Sun
Abstract: Collective communication operations, which are widely used in parallel applications for global communication and synchronization, are critical for application performance and scalability. However, how faulty collective communications impact the application and how errors propagate between the application processes is largely unexplored. One of the critical reasons for this situation is the lack of a fast evaluation method to investigate the impacts of faulty collective operations. Traditional random fault injection methods, which rely on a large number of fault injection tests to ensure statistical significance, require a significant amount of resources and time. These methods result in prohibitive evaluation cost when applied to the collectives. In this paper, we introduce a novel tool named Fast Fault Injection and Sensitivity Analysis Tool (FastFIT) to conduct fast fault injection and characterize the application sensitivity to faulty collectives. The tool achieves fast exploration by reducing the exploration space and predicting the application sensitivity using Machine Learning (ML) techniques. The basis for these techniques is the set of implicit correlations between MPI semantics, application context, critical application features, and application responses to faulty collective communications. The experimental results show that our approach reduces the fault injection points and tests by 97% for representative benchmarks (NAS Parallel Benchmarks (NPB)) and a realistic application (Large-scale Atomic/Molecular Massively Parallel Simulator (LAMMPS)) on a production supercomputer. Further, we statistically generalize the application sensitivity to faulty collective communications for these workloads, and present the correlation between application features and the sensitivity.
Citations: 3