2014 IEEE International Parallel & Distributed Processing Symposium Workshops: Latest Articles

Scalable and Reliable Data Broadcast with Kascade
2014 IEEE International Parallel & Distributed Processing Symposium Workshops Pub Date : 2014-05-19 DOI: 10.1109/IPDPSW.2014.191
Stéphane Martin, Tom Buchert, Pierric Willemet, Olivier Richard, E. Jeanvoine, L. Nussbaum
Abstract: Many large-scale scientific computations and Big Data analyses require the distribution of large amounts of data to each machine involved. That distribution of data often plays a key role in the overall performance of the operation. In this paper, we present Kascade, a solution for the broadcast of data to a large set of compute nodes. We evaluate Kascade using a set of large-scale experiments in a variety of experimental settings, and show that Kascade: (1) achieves very high scalability by organizing nodes in a pipeline; (2) can almost saturate a 1 Gbit/s network, even at large scale; (3) handles failures of nodes during the transfer gracefully thanks to a fault-tolerant design.
Citations: 0
Exploring Large Scale Receptor-Ligand Pairs in Molecular Docking Workflows in HPC Clouds
2014 IEEE International Parallel & Distributed Processing Symposium Workshops Pub Date : 2014-05-19 DOI: 10.1109/IPDPSW.2014.65
Kary A. C. S. Ocaña, Silvia Benza, Daniel de Oliveira, Jonas Dias, M. Mattoso
Abstract: Computer-aided drug design techniques are important assets in the pharmaceutical industry because of their support for research and development of new drugs. Molecular docking (MD) predicts the binding modes of specific compounds within the active site of target proteins. Since MD is a time-consuming process, existing approaches reduce the number of receptors or ligands in docking by evaluating only small sets of compounds. This restriction of the search space reduces the chances of uniformly covering the diverse space of compounds and misses opportunities to recognize whether new drugs can be identified. Another difficulty at large scale is analyzing the results, e.g. browsing all directories manually to find which pairs were docked successfully. To address these issues we explored the potential of data provenance analysis and parallel processing of SciCumulus, a cloud Scientific Workflow Management System. We present SciDock, a molecular docking-based virtual screening workflow, and evaluate its execution using 10,000 receptor-ligand pairs related to protease enzymes of protozoan genomes. The overall performance of SciDock using 32 cores, in cloud virtual machines, reaches improvements of up to 95.4% when running SciDock with AutoDock and 96.1% when running SciDock with Vina. We show how data provenance improved the result analysis and how it may indicate potential protease drug targets for protozoan treatments.
Citations: 14
Comparison of Parallel Programming Models on Intel MIC Computer Cluster
2014 IEEE International Parallel & Distributed Processing Symposium Workshops Pub Date : 2014-05-19 DOI: 10.1109/IPDPSW.2014.105
Chenggang Lai, Zhijun Hao, Miaoqing Huang, Xuan Shi, Haihang You
Abstract: Coprocessors based on the Intel Many Integrated Core (MIC) Architecture have been adopted in many high-performance computer clusters. Typical parallel programming models, such as MPI and OpenMP, are supported on MIC processors to achieve parallelism. In this work, we conduct a detailed study of the performance and scalability of the MIC processors under different programming models using the Beacon computer cluster. Our findings are as follows. (1) The native MPI programming model on the MIC processors is typically better than the offload programming model, which offloads the workload to MIC cores using OpenMP, on the Beacon computer cluster. (2) On top of the native MPI programming model, multithreading inside each MPI process can further improve the performance of parallel applications on computer clusters with MIC coprocessors. (3) Given a fixed number of MPI processes, it is a good strategy to schedule these MPI processes on as few MIC processors as possible to reduce the cross-processor communication overhead. (4) The hybrid MPI programming model, in which data processing is distributed to both MIC cores and CPU cores, can outperform the native MPI programming model.
Citations: 6
An ILP-Based Optimal Circuit Mapping Method for PLDs
2014 IEEE International Parallel & Distributed Processing Symposium Workshops Pub Date : 2014-05-19 DOI: 10.1109/IPDPSW.2014.33
Hiroki Nishiyama, Masato Inagi, S. Wakabayashi, Shinobu Nagayama, Keisuke Inoue, M. Kaneko
Abstract: In this paper, we discuss an ILP-based method for simultaneous optimal technology mapping, placement and routing for programmable logic devices, such as FPGAs, as fundamental research for architecture and algorithm evaluation. In general, heuristic methods are used for technology mapping, placement and routing, and many such methods have been developed. Although they are used to obtain high quality solutions within a practical time period, high quality is not guaranteed. In addition, the separated design processes make the final solutions not optimal. Simultaneous and optimal methods are useful for evaluating and developing heuristic methods, even if optimal methods take a long time. Furthermore, they can be used to evaluate reconfigurable architectures. In experiments, we confirmed that the optimal total wire length and critical path length of small circuits were obtained using our method. Critical path lengths were reduced by 28.6% on average when optimized.
Citations: 0
CHIUW Introduction and Committees
2014 IEEE International Parallel & Distributed Processing Symposium Workshops Pub Date : 2014-05-19 DOI: 10.1109/IPDPSW.2014.232
B. Chamberlain
Abstract: Background: Chapel (http://chapel.cray.com) is an emerging parallel programming language whose design and implementation are being led by Cray Inc. in collaboration with members of computing labs, academia, and industry, both domestically and internationally. Having successfully fulfilled its research objectives under the DARPA High Productivity Computing Systems (HPCS) program that launched it, Chapel is now at the outset of a five-year effort to improve its performance, stability, and utility for real users in the field.
Citations: 0
Extracting Maximal Exact Matches on GPU
2014 IEEE International Parallel & Distributed Processing Symposium Workshops Pub Date : 2014-05-19 DOI: 10.1109/IPDPSW.2014.159
Anas Abu-Doleh, K. Kaya, M. Abouelhoda, Ümit V. Çatalyürek
Abstract: The revolution in high-throughput sequencing technologies accelerated the discovery and extraction of various genomic sequences. However, the massive size of the generated datasets raises several computational problems. For example, aligning the sequences or finding the similar regions in them, which is one of the crucial steps in many bioinformatics pipelines, is a time-consuming task. Maximal exact matches have been considered important for detecting and evaluating similarity. Most of the existing tools that are designed and developed to find the maximal matches are based on advanced index structures such as suffix trees or suffix arrays. Although these structures triggered the development of efficient search algorithms, they need large indexing tables, which yield a large memory footprint for the software using them and bring significant overhead. In this article, we introduce a novel tool, GPUMEM, which effectively utilizes the massively parallel GPU threads while finding maximal exact matches inside two genome sequences using a lightweight indexing structure. The index construction, which is also handled on the GPU, is so fast that even when the index generation time is included, GPUMEM can be faster in practice than a state-of-the-art tool that uses a pre-built index.
Citations: 8
Hardware/Software Vectorization for Closeness Centrality on Multi-/Many-Core Architectures
2014 IEEE International Parallel & Distributed Processing Symposium Workshops Pub Date : 2014-05-19 DOI: 10.1109/IPDPSW.2014.156
Ahmet Erdem Sarıyüce, Erik Saule, K. Kaya, Ümit V. Çatalyürek
Abstract: Centrality metrics have been shown to be highly correlated with the importance and loads of the nodes in a network. Given the scale of today's social networks, it is essential to use efficient algorithms and high performance computing techniques for their fast computation. In this work, we exploit hardware and software vectorization in combination with fine-grained parallelization to compute the closeness centrality values. The proposed vectorization approach enables us to do concurrent breadth-first search operations and significantly increases the performance. We provide a comparison of different vectorization schemes and experimentally evaluate our contributions with respect to the existing parallel CPU-based solutions on cutting-edge hardware. Our implementations are 11 times faster than the state-of-the-art implementation for a graph with 234 million edges. The proposed techniques show how vectorization can be efficiently utilized to execute other graph kernels that require multiple traversals over a large-scale network on cutting-edge architectures.
Citations: 10
EA: Research-Infused Teaching of Parallel Programming Concepts for Undergraduate Software Engineering Students
2014 IEEE International Parallel & Distributed Processing Symposium Workshops Pub Date : 2014-05-19 DOI: 10.1109/IPDPSW.2014.122
Nasser Giacaman, O. Sinnen
Abstract: This paper presents experience using a research-infused teaching approach in an undergraduate parallel programming course. The research-teaching nexus is applied at various levels, first by using research-led teaching of core parallel programming concepts, as well as teaching the latest developments from the affiliated research group. The bulk of the course, however, focuses more on the student-driven research-based and research-tutored teaching approaches, where students actively participate in groups on research projects. Students are fully immersed in the learning activity of their respective project, while at the same time participating in discussions of wider parallel programming topics across other groups. This intimate affiliation between the undergraduate course and the research group results in a wide range of benefits for all those involved.
Citations: 7
HiPGA: A High Performance Genome Assembler for Short Read Sequence Data
2014 IEEE International Parallel & Distributed Processing Symposium Workshops Pub Date : 2014-05-19 DOI: 10.1109/IPDPSW.2014.68
Xiaohui Duan, Kun Zhao, Weiguo Liu
Abstract: Emerging next-generation sequencing technologies have opened up exciting new opportunities for genome sequencing by generating read data with a massive throughput. However, the generated reads are significantly shorter than those of the traditional Sanger shotgun sequencing method. This poses challenges for de novo assembly algorithms in terms of both accuracy and efficiency. Due to the continuing explosive growth of short read databases, there is a high demand to accelerate the often repeated long-runtime assembly task. In this paper, we present a scalable parallel algorithm, HiPGA, to accelerate the de Bruijn graph-based genome assembly for high-throughput short read data. In order to make full use of the compute power of both shared-memory multi-core CPUs and distributed-memory systems, we have used a parallelized file I/O scheme as well as a hybrid parallelism for the whole assembly pipeline. Evaluations using three real paired-end datasets and the Yoruba individual dataset show that, compared to two other well-parallelized assemblers, ABySS and PASHA, HiPGA achieves speedups of up to 7 while delivering comparable accuracy on 64 CPU cores of a compute cluster.
Citations: 6
XSW: Accelerating Biological Database Search on Xeon Phi
2014 IEEE International Parallel & Distributed Processing Symposium Workshops Pub Date : 2014-05-19 DOI: 10.1109/IPDPSW.2014.108
Lipeng Wang, Yuandong Chan, Xiaohui Duan, Haidong Lan, Xiangxu Meng, Weiguo Liu
Abstract: In this paper we present XSW, a new parallel Smith-Waterman algorithm for searching protein sequence databases on the Xeon Phi coprocessor. In order to make full use of the compute power of the many-core Xeon Phi hardware, we have used a two-level parallelization scheme, thread-level coarse-grained and VPU-level fine-grained parallelism, to implement our algorithm. At the thread level, XSW employs multi-threading to implement the SIMD parallelism. At the VPU level, we have used the Knights Corner instructions to gain more data parallelism. We have also reorganized the database and made use of the parallel shuffling operations on Xeon Phi to achieve better I/O efficiency. Evaluations on real protein sequence databases show that XSW achieves a peak performance of 70 GCUPS on a single Intel Xeon Phi 7110 card. Compared to two other well-parallelized Smith-Waterman algorithms, the multi-core CPU-based SWIPE and the GPU-based CUDASW++ 3.0, XSW achieves much better performance than SWIPE, and performance comparable to, with better accuracy than, CUDASW++ 3.0. To our knowledge this is the first reported implementation of the Smith-Waterman algorithm on Xeon Phi. The executable binary code of XSW is available at http://sdu-hpcl.github.io/XSW/.
Citations: 30