Designing Network Failover and Recovery in MPI for Multi-Rail InfiniBand Clusters
S. Raikar, H. Subramoni, K. Kandalla, Jérôme Vienne, D. Panda
2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW), May 21, 2012. DOI: https://doi.org/10.1109/IPDPSW.2012.142

Abstract: The emerging trend of designing commodity-based supercomputing systems has a severe detrimental impact on the Mean Time Between Failures (MTBF). The MTBF for typical HEC installations is currently estimated to be between eight hours and fifteen days [1]. Failures in the interconnect fabric account for a fair share of the total failures occurring in such systems, and this will only worsen as system sizes grow. It is therefore highly desirable that next-generation system architectures and software environments provide sophisticated network-level fault-tolerance and fault-resilience solutions. In the past few years, the number of cores per processor has increased dramatically; to use such machines efficiently, the required bandwidth must be delivered to all cores. To keep up with the multi-core trend, current-generation supercomputers and clusters are designed with multiple network cards (rails) for enhanced data transfer capability. Besides improving performance, such multi-rail networks can also be leveraged for network-level fault resilience. This paper presents a failover design for the multi-rail scenario that handles network failures and their recovery without compromising performance. In a typical message-passing scenario, a network failure aborts the entire job; our design allows the job to continue by using the remaining rails for communication. We also propose a protocol that, once a rail recovers, re-establishes connections on it and resumes normal operation. We experimentally demonstrate that our implementation adds very little overhead and delivers performance comparable to that of the other rails running in isolation. Recovery is immediate and incurs no additional overhead. We further demonstrate the robustness of the design by running application benchmarks with permanent failures.
Business Process Oriented Platform-as-a-Service Framework for Process Instances Intensive Applications
Yongqing Zheng, Jinshan Pang, Jian Li, Li-zhen Cui
2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW), May 21, 2012. DOI: https://doi.org/10.1109/IPDPSW.2012.284

Abstract: As cloud computing grows increasingly popular in both commercial and academic settings, Platform-as-a-Service (PaaS) has become a core technology by which providers deliver services to both ordinary users and scientific organizations. This paper describes BPPaaS, a business-process-oriented Platform-as-a-Service framework comprising an integrated business process application programming model and business-process-oriented PaaS middleware. BPPaaS lets users submit business process logic written in the integrated business process programming language to the platform, which parses the source code, extracts the business process tasks and their relationships into metadata, and encodes the tasks as standalone executable components. Because each cloud data center holds specific data, BPPaaS assigns business process tasks to the data centers that hold the data those tasks require, using them as task execution nodes. A scheduling algorithm is introduced to support the execution of process-instance-intensive applications on multiple heterogeneous Java runtime environments serving as the underlying parallel computation platform. Finally, a case study from a social security application shows that the framework can streamline complex computational business processes.
{"title":"Analysis and Optimization of Data Import with Hadoop","authors":"Weijia Xu, Wei Luo, N. Woodward","doi":"10.1109/IPDPSW.2012.129","DOIUrl":"https://doi.org/10.1109/IPDPSW.2012.129","url":null,"abstract":"Data driven research has become an important part of scientific discovery in an increasing number of disciplines. In many cases, the sheer volume of data to be processed requires not only state-of-the-art computing resources but also carefully tuned and specifically developed software. These requirements are often associated with huge operational costs and significant expertise in software development. Due to its simplicity for the user and effectiveness at processing big data, Hadoop has become a popular software platform for large-scale data analysis. Using a Hadoop cluster in a remote shared infrastructure enables users to avoid the costs of maintaining a physical infrastructure. An inevitable step in using dynamically constructed Hadoop cluster is the initial importing of the data. This process is not trivial, particularly when the size of the data is large. In this paper, we evaluate the costs of importing large-scale data into a Hadoop cluster. We present a detailed analysis of the default data importing implementation in Hadoop and conduct a practical evaluation. Our evaluation includes tests with different hardware configurations, such as different network protocol and disk configurations. We also propose an implementation to improve the performance of importing data into a Hadoop cluster wherein the data is accessed directly by Data nodes during the import process.","PeriodicalId":378335,"journal":{"name":"2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-05-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126111188","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimized Reduce for Mesh-Based NoC Multiprocessors","authors":"A. Kohler, M. Radetzki","doi":"10.1109/IPDPSW.2012.111","DOIUrl":"https://doi.org/10.1109/IPDPSW.2012.111","url":null,"abstract":"Future processors are expected to be made up of a large number of computation cores interconnected by fast on-chip networks (Network-on-Chip, NoC). Such distributed structures motivate the use of message passing programming models similar to MPI. Since the properties of these networks, like e.g. the topology, are known and fixed after production, this knowledge can be used to optimize the communication stack. We describe two schemes that take advantage of this to accelerate the (All-)Reduce operation defined in MPI, namely a contention avoiding rank-to-core mapping and a way of interleaving communication and computation by means of pipelining. Simulations show that the combination of both schemes can accelerate (All-)Reduce operations by more than 60%.","PeriodicalId":378335,"journal":{"name":"2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-05-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123569843","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On Dynamic Run-time Processor Pipeline Reconfiguration","authors":"Carsten Tradowsky, F. Thoma, M. Hübner, J. Becker","doi":"10.1109/IPDPSW.2012.53","DOIUrl":"https://doi.org/10.1109/IPDPSW.2012.53","url":null,"abstract":"Adaptation of hardware in relation to the requirements of a specific application is well known and investigated in the domain of Field Programmable Gate Arrays (FPGA) based reconfigurable system architectures. In these system approaches, a number of predefined blocks, mainly accelerators for processors, are loaded from an external storage and are transferred to the FPGA configuration memory in order to manipulate the on-chip functionality. A novel approach is to adapt the micro architecture of a processor in order to achieve a temporal application-specific behavior. In combination with the well known techniques of dynamic reconfiguration of a FPGA, novel degrees of freedom are available for an energy efficient run-time dynamic system approach. This paper presents one adaptation mechanism, in which the pipeline depth is adapted according to the control flow and data flow of an application. The concept and also the realization are described and evaluated in terms of efficiency with some benchmarks.","PeriodicalId":378335,"journal":{"name":"2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-05-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123775649","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Implementation and Evaluation of Triple Precision BLAS Subroutines on GPUs","authors":"Daichi Mukunoki, D. Takahashi","doi":"10.1109/IPDPSW.2012.175","DOIUrl":"https://doi.org/10.1109/IPDPSW.2012.175","url":null,"abstract":"We implemented and evaluated the triple precision Basic Linear Algebra Subprograms (BLAS) subroutines, AXPY, GEMV and GEMM on a Tesla C2050. In this paper, we present a Double Single (D+S) type triple precision floating-point value format and operations. They are based on techniques similar to Double-Double (DD) type quadruple precision operations. On the GPU, the D+S-type operations are more costly than the DD-type operations in theory and in practice. Therefore, the triple precision GEMM, which is a compute-bound operation, is slower than the quadruple precision GEMM. However, the triple precision AXPY and GEMV are memory-bound operations on the GPU, thus their execution time of these triple precision subroutines is close to 3/4 of the quadruple precision subroutines. Therefore, we conclude that the triple precision value format is useful for memory-bound operations, in cases where the quadruple precision is not required, but double precision is not sufficient.","PeriodicalId":378335,"journal":{"name":"2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-05-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125300248","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Multi-Processor Scheduling Problem in Phylogenetics","authors":"Jiajie Zhang, A. Stamatakis","doi":"10.1109/IPDPSW.2012.86","DOIUrl":"https://doi.org/10.1109/IPDPSW.2012.86","url":null,"abstract":"Advances in wet-lab sequencing techniques allow for sequencing between 100 genomes up to 1000 full transcriptomes of species whose evolutionary relationships shall be disentangled by means of phylogenetic analyses. Likelihood-based evolutionary models allow for partitioning such broad phylogenomic datasets, for instance into gene regions, for which likelihood model parameters (except for the tree itself) can be estimated independently. Present day phylogenomic datasets are typically split up into 1000-10,000 distinct partitions. While the likelihood on such datasets needs to be computed in parallel because of the high memory requirements, it has not yet been assessed how to optimally distribute partitions and/or alignment sites to processors, in particular when the number of cores is significantly smaller than the number of partitions. We find that, by distributing partitions (of varying lengths) monolithically to processors, the induced load distribution problem essentially corresponds to the well-known multiprocessor scheduling problem. By implementing the simple Longest Processing Time (LPT) heuristics in the PThreads and MPI version of RAxML-Light, we were able to accelerate run times by up to one order of magnitude. Other heuristics for multi-processor scheduling such as improved MultiFit, improved Zero-One, or the Three Phase approach did not yield notable performance improvements.","PeriodicalId":378335,"journal":{"name":"2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-05-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126641254","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An Effective Self-adaptive Load Balancing Algorithm for Peer-to-Peer Networks
N. Xiong, Kaihua Xu, Lilong Chen, L. Yang, Yuhua Liu
2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW), May 21, 2012. DOI: https://doi.org/10.1109/IPDPSW.2012.179

Abstract: The field of parallel and distributed computing has become increasingly significant with recent advances in electronic and integrated circuit technologies. Peer-to-Peer (P2P) cloud computing networks are the largest contributor of network traffic on the Internet. Measurement plays an important role in many P2P applications, and measurement-based optimization of P2P networking and applications should be strengthened. In particular, extensive schemes have been proposed to improve file sharing efficiency in P2P networks while reducing inter-domain traffic, and file sharing has become a serious concern. However, differences in node capability, free-riding behavior, and high churn cause great load imbalance among high-speed network nodes. This paper presents a self-adaptive load balancing algorithm in which nodes automatically build binary-tree backup-node tables for their shared hot files and transfer excess query connections, originally sent to heavily loaded nodes, to backup nodes. The experimental results show that our algorithm reduces the load on heavily loaded nodes and achieves good balance among high-speed network nodes; even under high churn it retains its balancing effect and lowers the load of the whole network.
{"title":"A QoS-Aware Service Selection Method for Cloud Service Composition","authors":"Huihui Bao, Wanchun Dou","doi":"10.1109/IPDPSW.2012.278","DOIUrl":"https://doi.org/10.1109/IPDPSW.2012.278","url":null,"abstract":"Many recent studies have been addressing the service selection problem based on non-functional aspects due to the ever-increasing number of web services. However, most existing works about QoS-based service composition treat the services referred in service composition as independent ones from each other, and their correlations are usually ignored. In reality, the services supplied by service providers in cloud environment are not segregate and irrelevant with each other. In view of this challenging problem, we use Finite State Machine (FSM) to prescribe the legal invocation orders of these web services, also an improved Tree-pruning-based algorithm is proposed to create the Web Service Composition Tree (WSCT). After generating all of the feasible execution paths, a Simple Additive Weighting (SAW) technique is used to select an optimal one. At last, an experiment is presented for validating the performance of the method.","PeriodicalId":378335,"journal":{"name":"2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum","volume":"285 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-05-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116107472","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Performance Benefits of Heterogeneous Computing in HPC Workloads","authors":"V. Lee, Edward T. Grochowski, Robert Y. Geva","doi":"10.1109/IPDPSW.2012.18","DOIUrl":"https://doi.org/10.1109/IPDPSW.2012.18","url":null,"abstract":"Chip multi-processors (CMPs) with increasing number of processor cores are now becoming widely available. To take advantage of many-core CMPs, applications must be parallelized. However, due to the nature of algorithm/programming model, some parts of the application would remain serial. According to Amdahl's law, the speedup of a parallel application is limited by the amount of serial execution it has. For a CMP with many cores, this can be a serious limitation. To take full advantage of the increasing number of cores, one must try to reduce the execution time of the serial portion of a parallel program. However, rewriting an application takes time and often the return on the effort invested may not justify parallelizing every part of the program. Heterogeneous many-core CMP design is one possible solution to support massive parallel execution and to provide a reasonable single-thread performance. In this paper, we use a simple spreadsheet model to evaluate homogeneous and heterogeneous CMP designs using execution profiles of real HPC applications. Evaluated on 12 parallel HPC applications, we show that heterogeneous CMPs can outperform homogeneous CMPs by up to 1.35× with an average speedup of 1.06× when both the heterogeneous CMPs and homogeneous CMPs are constrained to use the same power budget. Our study found the heterogeneous CMPs can take advantage of serial portion of execution that is as little as 2% of total run time to provide performance benefit. This suggests heterogeneous computing can help mitigate the effect of not parallelizing some portions of an application due to return on investment concern on programming efforts.","PeriodicalId":378335,"journal":{"name":"2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum","volume":"269 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-05-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122933560","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}