{"title":"Hybrid BFS Approach Using Semi-external Memory","authors":"Keita Iwabuchi, Hitoshi Sato, Ryo Mizote, Yuichiro Yasui, K. Fujisawa, S. Matsuoka","doi":"10.1109/IPDPSW.2014.189","DOIUrl":"https://doi.org/10.1109/IPDPSW.2014.189","url":null,"abstract":"NVM devices will greatly expand the possibility of processing extremely large-scale graphs that exceed the DRAM capacity of the nodes; however, efficient implementations based on a detailed performance analysis of the access patterns of unstructured graph kernels on systems that use a mixture of DRAM and NVM devices have not been well investigated. We introduce a graph data offloading technique using NVMs that augments the hybrid BFS (breadth-first search) algorithm widely used in the Graph500 benchmark, and we conduct a performance analysis to demonstrate the utility of NVMs for unstructured data. Experimental results for a SCALE 27 problem on a Kronecker graph compliant with the Graph500 benchmark show that our approach sustains up to 4.22 GTEPS (giga traversed edges per second), halving the DRAM footprint with only 19.18% performance degradation on a 4-way AMD Opteron 6172 machine heavily equipped with NVM devices. Although direct comparison is difficult, this is significantly greater than the 0.05 GTEPS reported by Pearce et al. for a SCALE 36 problem using 1 TB of DRAM and 12 TB of NVM. Although our approach uses a higher DRAM-to-NVM ratio, we show that a good compromise between performance and capacity is achievable for processing large-scale graphs. This result, together with a detailed performance analysis of the proposed technique, suggests that we can process extremely large-scale graphs per node with minimal performance degradation by carefully considering the data structures of a given graph and the access patterns to both DRAM and NVM devices. 
As a result, our implementation achieved 4.35 MTEPS/W (mega TEPS per watt) and ranked 4th on the November 2013 edition of the Green Graph500 list in the Big Data category, using only a single fat server heavily equipped with NVMs.","PeriodicalId":153864,"journal":{"name":"2014 IEEE International Parallel & Distributed Processing Symposium Workshops","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114951556","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
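The hybrid BFS this paper builds on alternates between top-down and bottom-up traversal depending on the frontier size. A minimal Python sketch of that switching logic (the threshold `alpha` and the work estimates are illustrative, not the paper's tuned values, and the DRAM/NVM offloading layer is omitted entirely):

```python
def hybrid_bfs(adj, source, alpha=2.0):
    """Hybrid BFS: top-down while the frontier is small, bottom-up once the
    frontier's outgoing edges outweigh those of the unvisited vertices."""
    parent = {source: source}
    frontier = {source}
    while frontier:
        # Rough work estimates: edges out of the frontier (mf) vs. edges
        # out of still-unvisited vertices (mu).
        mf = sum(len(adj[v]) for v in frontier)
        unvisited = [v for v in adj if v not in parent]
        mu = sum(len(adj[v]) for v in unvisited)
        nxt = set()
        if mf * alpha > mu:
            # Bottom-up step: each unvisited vertex scans for a frontier parent.
            for v in unvisited:
                for u in adj[v]:
                    if u in frontier:
                        parent[v] = u
                        nxt.add(v)
                        break
        else:
            # Top-down step: frontier vertices claim their unvisited neighbours.
            for u in frontier:
                for v in adj[u]:
                    if v not in parent:
                        parent[v] = u
                        nxt.add(v)
        frontier = nxt
    return parent
```

The bottom-up step is what makes the data layout matter: it scans bulk adjacency data sequentially, which is the kind of access pattern that tolerates slower NVM better than random DRAM-style lookups.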
{"title":"Construction of Porous Networks Subjected to Geometric Restrictions by Using OpenMP","authors":"A. Mendez, G. Román-Alonso, F. Rojas-González, M. Castro-García, M. Cornejo, Salomón Cordero-Sánchez","doi":"10.1109/IPDPSW.2014.134","DOIUrl":"https://doi.org/10.1109/IPDPSW.2014.134","url":null,"abstract":"The study of porous materials is of great importance for a vast number of industrial applications. In order to study some specific characteristics of materials, in-silico simulations can be employed. The particular simulation of pore networks described in this work finds its basis in the Dual Site-Bond Model (DSBM). Under this approach, a porous material is thought to be made of sites (cavities, bulges) interconnected to each other through bonds (throats, capillaries); every site is connected to a number of bonds, while each bond links exactly two sites. At present, several computing algorithms have been implemented for the simulation of pore networks; nevertheless, only a few of these methods take into account the geometric restrictions that arise during the interconnection of a set of bonds to every site of the network. Introducing restrictions of this sort into the computing algorithms is likely to lead to more realistic pore networks. In this work, a sequential algorithm and its parallel computing version are proposed to construct pore networks while enforcing geometric restrictions among the hollow entities. Our parallel approach uses OpenMP to create a set of threads (computing tasks) that work simultaneously on independent, random pore network regions. 
We discuss the obtained results.","PeriodicalId":153864,"journal":{"name":"2014 IEEE International Parallel & Distributed Processing Symposium Workshops","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124259477","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
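In the DSBM, the key geometric restriction is that a bond can never be larger than either of the two sites it joins. A hypothetical Python sketch of region-parallel construction in that spirit (the chain topology, size ranges, and threading scheme are illustrative stand-ins for the paper's OpenMP implementation, not its actual algorithm):

```python
import random
from concurrent.futures import ThreadPoolExecutor

def build_region(n_sites, seed):
    """Build one independent network region: draw site sizes, then draw each
    bond subject to the restriction that a bond is never larger than either
    of the two sites it connects (chain topology for illustration)."""
    rng = random.Random(seed)
    sites = [rng.uniform(1.0, 10.0) for _ in range(n_sites)]
    bonds = []
    for i in range(n_sites - 1):
        cap = min(sites[i], sites[i + 1])  # geometric restriction
        bonds.append(rng.uniform(0.0, cap))
    return sites, bonds

def build_network(n_regions=4, n_sites=100):
    # Mirror the OpenMP approach: one thread per independent region, so no
    # synchronization is needed while regions are being filled.
    with ThreadPoolExecutor(max_workers=n_regions) as pool:
        return list(pool.map(lambda s: build_region(n_sites, s), range(n_regions)))
```

Giving each thread its own seeded generator keeps the regions both independent and reproducible, which is the property that lets the regions be constructed without locking.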
{"title":"Optimizing Krylov Subspace Solvers on Graphics Processing Units","authors":"H. Anzt, W. Sawyer, S. Tomov, P. Luszczek, I. Yamazaki, J. Dongarra","doi":"10.1109/IPDPSW.2014.107","DOIUrl":"https://doi.org/10.1109/IPDPSW.2014.107","url":null,"abstract":"Krylov subspace solvers are often the method of choice when solving sparse linear systems iteratively. At the same time, hardware accelerators such as graphics processing units (GPUs) continue to offer significant floating-point performance gains for matrix and vector computations through easy-to-use libraries of computational kernels. However, as these libraries are usually composed of a well-optimized but limited set of linear algebra operations, applications that use them often fail to leverage the full potential of the accelerator. In this paper, we target the acceleration of the BiCGSTAB solver for GPUs, showing that significant improvement can be achieved by reformulating the method and developing application-specific kernels instead of using the generic CUBLAS library provided by NVIDIA. We propose an implementation that benefits from a significantly reduced number of kernel launches and GPU-host communication events, by means of increased data locality and a simultaneous reduction of multiple scalar products. Using experimental data, we show that, depending on the dominance of the untouched sparse matrix-vector products, significant performance improvements can be achieved compared to a reference implementation based on the CUBLAS library. 
We feel that such optimizations are crucial for the subsequent development of high-level sparse linear algebra libraries.","PeriodicalId":153864,"journal":{"name":"2014 IEEE International Parallel & Distributed Processing Symposium Workshops","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128066917","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
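The "simultaneous reduction of multiple scalar products" can be pictured as fusing several dot products into a single pass over the vectors, so one kernel launch (and one host transfer) replaces several. A toy Python sketch of the idea (the particular triple of BiCGSTAB scalar products shown is an assumption for illustration, not the paper's exact kernel):

```python
def fused_dots(r, r0, v):
    """One pass over the vectors yields three scalar products at once,
    where separate BLAS dot calls would cost three kernel launches."""
    s1 = s2 = s3 = 0.0
    for a, b, c in zip(r, r0, v):
        s1 += a * b   # (r, r0)
        s2 += a * a   # (r, r), the squared residual norm
        s3 += c * b   # (v, r0)
    return s1, s2, s3
```

Besides fewer launches, the fused pass reads each vector once instead of once per product, which is the data-locality gain the abstract refers to.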
{"title":"Towards Energy Efficient Allocation for Applications in Volunteer Cloud","authors":"Congfeng Jiang, Jian Wan, C. Cérin, Paolo Gianessi, Yanik Ngoko","doi":"10.1109/IPDPSW.2014.169","DOIUrl":"https://doi.org/10.1109/IPDPSW.2014.169","url":null,"abstract":"The topology of classical cloud infrastructures can be viewed as data centers to which user machines are connected. In these architectures, computation is centered on a subset of the available machines (the data centers). In our study, we propose an alternative view of clouds in which both user machines and data centers are used for servicing requests. We refer to these clouds as volunteer clouds. Volunteer clouds offer potential advantages in elasticity and energy savings, but we also have to manage the unavailability of volunteer nodes. In this paper, we are interested in optimizing the energy consumed by the provisioning of applications in volunteer clouds. Given a set of applications requested by the cloud's clients for a window of time, the objective is to find the least energy-consuming deployment plan. In comparison with many works on resource allocation, our distinguishing feature is the management of the unavailability of volunteer nodes. We show that our core challenge can be formalized as an NP-hard and inapproximable problem. We then propose an ILP (integer linear programming) model and various greedy heuristics to solve it. Finally, we provide an experimental analysis of our proposal using realistic data and modeling for energy consumption. This is a modeling study with simulation results rather than emulation or experiments on real systems; however, the parameters and assumptions made for our simulations are consistent with the knowledge generally accepted by researchers working on energy modeling and volunteer computing. 
Consequently, our work should be seen as a solid building block towards the implementation of allocation mechanisms in volunteer clouds.","PeriodicalId":153864,"journal":{"name":"2014 IEEE International Parallel & Distributed Processing Symposium Workshops","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124578659","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
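As a rough illustration of the allocation problem (not the paper's ILP model or its heuristics), a greedy rule might weight a node's power draw by its availability, so unreliable volunteer nodes look proportionally more expensive:

```python
def greedy_allocate(apps, nodes):
    """Greedy sketch: assign each application (largest load first) to the
    feasible node with the lowest expected energy, where a volunteer node's
    low availability inflates its expected cost (downtime must be covered).
    `apps` maps name -> load; `nodes` maps name -> {free, power, availability}."""
    plan = {}
    for app, load in sorted(apps.items(), key=lambda kv: -kv[1]):
        best, best_cost = None, float("inf")
        for name, n in nodes.items():
            if n["free"] < load:
                continue  # capacity constraint
            cost = n["power"] * load / n["availability"]
            if cost < best_cost:
                best, best_cost = name, cost
        if best is None:
            raise RuntimeError("no feasible node for " + app)
        nodes[best]["free"] -= load
        plan[app] = best
    return plan
```

The availability divisor is the hedged part: it is one plausible way to fold unavailability into an energy objective, chosen here only to make the volunteer-vs-datacenter trade-off concrete.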
{"title":"A Game-Theoretic Approach to Multiobjective Job Scheduling in Cloud Computing Systems","authors":"Jakub Gasior, F. Seredyński","doi":"10.1109/IPDPSW.2014.60","DOIUrl":"https://doi.org/10.1109/IPDPSW.2014.60","url":null,"abstract":"This paper presents a distributed, security-driven solution to the multiobjective job scheduling problem in Cloud Computing infrastructures. The goal of this scheme is to allocate a limited quantity of resources to a specific number of jobs while minimizing their execution failure probability and job completion time. As this problem is NP-hard in the strong sense, the NSGA-II meta-heuristic is proposed to solve it. To select the best strategy from the resulting Pareto frontier, we develop decision-making mechanisms based on the game-theoretic model of the Spatial Prisoner's Dilemma, realized by independent, selfish brokering agents. Their behavior is conditioned by the objectives of the various entities involved in the scheduling process and driven towards a Nash equilibrium solution by the employed social welfare criteria. The performance of the applied scheduler is verified by a number of numerical experiments. The results show the effectiveness of the proposed solution for medium- and large-sized scheduling problems.","PeriodicalId":153864,"journal":{"name":"2014 IEEE International Parallel & Distributed Processing Symposium Workshops","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124579048","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
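The Pareto frontier the brokering agents select from consists of the non-dominated schedules. A minimal sketch of that dominance test over the two objectives named in the abstract, failure probability and completion time (both minimized):

```python
def pareto_front(points):
    """Return the non-dominated points. q dominates p if q is no worse in
    both objectives and, being a different point, strictly better in at
    least one. Each point is a (failure_probability, completion_time) pair."""
    return [p for p in points
            if not any(q != p and q[0] <= p[0] and q[1] <= p[1]
                       for q in points)]
```

Everything after this filter, the agents' strategies and the social welfare criteria, operates only on the surviving points, which is why the frontier is computed first.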
{"title":"Searching for the Optimal Data Partitioning Shape for Parallel Matrix Matrix Multiplication on 3 Heterogeneous Processors","authors":"Ashley M. DeFlumere, Alexey L. Lastovetsky","doi":"10.1109/IPDPSW.2014.8","DOIUrl":"https://doi.org/10.1109/IPDPSW.2014.8","url":null,"abstract":"Parallel Matrix-Matrix Multiplication (MMM) is a fundamental part of the linear algebra libraries used by scientific applications on high performance computers. As heterogeneous systems have emerged as high performance computing platforms, the traditional homogeneous algorithms have been adapted to these heterogeneous environments. Although heterogeneous systems have been in use for some time, it remains an open problem how to partition data optimally among heterogeneous processors so as to minimize computation, communication, and execution time. While the question of how to subdivide MMM problems among heterogeneous processors has been studied, the underlying assumption of this prior work is that the data partition shape, the layout of the data within the matrix assigned to each processor, should be rectangular, i.e., that each processor should be assigned a rectangular portion of the matrix to compute. Our previous work in this area questioned the optimality of this traditional rectangular shape and studied the partition shape problem for two processors. In that work, we proposed a novel mathematical method for transforming partition shapes to decrease communication cost and an analytical technique for determining the optimal shape. In this work, we extend this technique to three or more heterogeneous processors. While applying this method to two processors is relatively straightforward, the complexity grows immensely when considering three processors. With this complexity in mind, we propose a hybrid of experimental and analytical techniques. 
We postulate that a small number of partition shapes are potentially optimal and, using a computer-aided method, perform extensive testing with our previously developed analytical technique without finding a counterexample. We identify six data partition shapes that are candidates to be the optimal three-processor shape.","PeriodicalId":153864,"journal":{"name":"2014 IEEE International Parallel & Distributed Processing Symposium Workshops","volume":"61 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127036325","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
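For intuition on why non-rectangular shapes are worth searching for (a standard observation in this line of work, not a result quoted from the paper): in the usual half-perimeter communication model, a column-wise straight-line partition has the same total cost no matter how the processor speeds differ, leaving room for cleverer shapes to do better:

```python
def straight_line_cost(speeds, N=1.0):
    """Half-perimeter communication cost of a column-wise straight-line
    partition of an N x N matrix: each processor gets a full-height
    rectangle with area proportional to its speed, and its communication
    volume is proportional to width + height."""
    total = float(sum(speeds))
    widths = [N * s / total for s in speeds]
    return sum(w + N for w in widths)  # widths sum to N, so cost = N*(p+1)
```

Since the widths always sum to N, the cost is N*(p+1) for p processors regardless of the speed ratio, which is one reason the authors' search over non-rectangular candidate shapes can pay off.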
{"title":"A Novel Computational Model for GPUs with Application to I/O Optimal Sorting Algorithms","authors":"A. Koike, K. Sadakane","doi":"10.1109/IPDPSW.2014.72","DOIUrl":"https://doi.org/10.1109/IPDPSW.2014.72","url":null,"abstract":"We propose a novel computational model for GPUs. Known parallel computational models such as the PRAM model are not appropriate for evaluating GPU algorithms. Our model, called AGPU, abstracts the essence of current GPU architectures, such as global and shared memory, memory coalescing, and bank conflicts. We can therefore evaluate the asymptotic behavior of GPU algorithms more accurately than with known models, and we can develop algorithms that are efficient on many real architectures. As a showcase, we first analyze known comparison-based sorting algorithms using the AGPU model and show that they are not I/O optimal, that is, the number of global memory accesses is larger than necessary. Then we propose a new algorithm that uses an asymptotically optimal number of global memory accesses and whose time complexity is also nearly optimal.","PeriodicalId":153864,"journal":{"name":"2014 IEEE International Parallel & Distributed Processing Symposium Workshops","volume":"59 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127911853","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
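A model like AGPU charges global-memory cost per coalesced transaction rather than per individual access. A toy sketch of that counting rule (the segment size `B` and the exact coalescing rule are illustrative, not AGPU's actual definitions):

```python
def global_memory_transactions(addresses, B=32):
    """Count global-memory transactions for one warp's worth of accesses
    under a simple coalescing rule: addresses falling in the same B-word
    aligned segment are served by a single transaction."""
    return len({a // B for a in addresses})
```

Under such a rule, a warp reading consecutive words costs one transaction while the same warp reading with a large stride costs one per thread, which is exactly the gap an I/O-optimal sorting algorithm tries to close.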
{"title":"Large Scale Discriminative Metric Learning","authors":"P. Kirchner, Matthias Boehm, B. Reinwald, D. Sow, J. M. Schmidt, D. Turaga, A. Biem","doi":"10.1109/IPDPSW.2014.181","DOIUrl":"https://doi.org/10.1109/IPDPSW.2014.181","url":null,"abstract":"We consider the learning of a distance metric, using the Localized Supervised Metric Learning (LSML) scheme, that discriminates entities characterized by high-dimensional feature attributes with respect to the labels assigned to each entity. LSML is a supervised learning scheme that learns a Mahalanobis distance, grouping together features with the same label and repelling features with different labels. In this paper, we propose an efficient and scalable implementation of LSML that allows us to process large data sets, both in terms of dimensions and instances. This implementation of LSML is programmed in SystemML with an R-like syntax, and compiled, optimized, and executed on Hadoop. We also propose experimental approaches for tuning LSML parameters, yielding significant analytical and empirical improvements in discriminative measures such as label prediction accuracy. 
We present experimental results on both synthetic and real-world data (feature vectors representing patients in an Intensive Care Unit, with labels corresponding to different conditions), assessing, respectively, how well the algorithm scales and how well it works on real-world prediction problems.","PeriodicalId":153864,"journal":{"name":"2014 IEEE International Parallel & Distributed Processing Symposium Workshops","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127755308","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
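The Mahalanobis distance LSML learns is parameterized by a positive semidefinite matrix M. A minimal sketch of the distance itself (the learning of M, which is the substance of LSML, and its SystemML/Hadoop execution are not shown):

```python
def mahalanobis_sq(x, y, M):
    """Squared Mahalanobis distance (x - y)^T M (x - y). Metric learning
    schemes like LSML fit the PSD matrix M so that same-label pairs come
    out close and different-label pairs come out far under this distance."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    return sum(d[i] * M[i][j] * d[j] for i in range(n) for j in range(n))
```

With M set to the identity this reduces to the squared Euclidean distance; a learned M effectively reweights and rotates the feature axes to match the labels.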
{"title":"Standby System Reliability through DRBD","authors":"S. Distefano","doi":"10.1109/IPDPSW.2014.149","DOIUrl":"https://doi.org/10.1109/IPDPSW.2014.149","url":null,"abstract":"The standby approach is of strategic importance in current technologies, since it reduces environmental impact while extending system lifetime, allowing a trade-off among dependability properties and costs. This is particularly interesting in the IT context, where increased awareness of energy efficiency and environmental issues has pushed towards new forms and paradigms of (\"green\") computing that specifically address such aspects. Standby policies, mechanisms, and techniques are characterized by complex phenomena that should be adequately investigated. This need translates into a strong demand for adequate tools for standby system modelling and evaluation. In this paper, the dynamic reliability block diagram (DRBD) formalism, which extends RBD to the representation of dynamic reliability aspects, is proposed for standby system evaluation. To this end, the DRBD semantics is revised to cover the specific peculiarities of standby systems. 
Then, the effectiveness of the DRBD approach in standby modelling is demonstrated through a case study of a critical-area surveillance system, in which a parametric capacity-planning analysis is performed to design the system with warm standby redundant cameras according to specific reliability requirements.","PeriodicalId":153864,"journal":{"name":"2014 IEEE International Parallel & Distributed Processing Symposium Workshops","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129103130","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
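Warm standby, as used in the camera case study, means the spare ages at a reduced rate until it takes over. A Monte Carlo sketch of a single 1-out-of-2 warm-standby pair (exponential lifetimes and the specific rates are illustrative assumptions, not the paper's DRBD model, which is analytic rather than simulated):

```python
import random

def warm_standby_reliability(lam, lam_standby, t, trials=20000, seed=1):
    """Estimate the probability that a primary unit plus one warm spare
    survives past time t. The spare fails at the reduced rate lam_standby
    while dormant and at the full rate lam after taking over."""
    rng = random.Random(seed)
    ok = 0
    for _ in range(trials):
        t1 = rng.expovariate(lam)                  # primary's lifetime
        if t1 >= t:
            ok += 1
            continue
        if rng.expovariate(lam_standby) < t1:
            continue                               # spare died while dormant
        if t1 + rng.expovariate(lam) >= t:         # spare's active lifetime
            ok += 1
    return ok / trials
```

The dormant-failure branch is what distinguishes warm from cold standby (where the spare cannot fail while idle) and from hot standby (where it fails at the full rate throughout).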
{"title":"SkewControl: Gini Out of the Bottle","authors":"Si Zheng, Yunhuai Liu, T. He, Shanshan Li, Xiangke Liao","doi":"10.1109/IPDPSW.2014.176","DOIUrl":"https://doi.org/10.1109/IPDPSW.2014.176","url":null,"abstract":"In the age of big data, MapReduce plays an important role in extreme-scale data processing systems. Among all the hot issues, data skew weighs heavily on MapReduce system performance. Traditional approaches leave it to the users to address the issue, which requires application-dependent domain knowledge. Other approaches address the issue automatically but in an open-loop manner that lacks sufficient adaptivity to different applications. To address these issues, we conduct trace-driven empirical studies and show that the skew has strongly stable and predictable characteristics, which allows us to design a closed-loop automatic mechanism for task partitioning and scheduling, called SkewControl. We implement SkewControl on top of a Hadoop 1.0.4 production system. The experimental results show that, compared with the state-of-the-art LATE and SkewTune systems, SkewControl consistently improves the system response time by 23.8% and 17%, respectively.","PeriodicalId":153864,"journal":{"name":"2014 IEEE International Parallel & Distributed Processing Symposium Workshops","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130619004","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
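The title's pun refers to the Gini coefficient, a natural single-number measure of partition skew (whether SkewControl uses exactly this statistic is not stated in the abstract):

```python
def gini(sizes):
    """Gini coefficient of partition sizes: 0 means perfectly even
    partitions; values approaching 1 mean one partition holds nearly
    all the data (the straggler a skew controller wants to prevent)."""
    xs = sorted(sizes)
    n = len(xs)
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return (2.0 * weighted) / (n * sum(xs)) - (n + 1.0) / n
```

A closed-loop controller in this spirit would repeatedly measure such a skew statistic on observed partition sizes and feed it back into the next round's partitioning decisions.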