2015 IEEE International Parallel and Distributed Processing Symposium Workshop: Latest Papers

Towards Context-Aware DNA Sequence Compression for Efficient Data Exchange
2015 IEEE International Parallel and Distributed Processing Symposium Workshop Pub Date : 2015-05-25 DOI: 10.1109/IPDPSW.2015.89
Wajeeta Lohana, J. Shamsi, T. Syed, Farrukh Hasan
Abstract: DNA sequencing has emerged as one of the principal research directions in systems biology because of its usefulness in predicting the provenance of disease, and it also has a profound impact on other fields such as biotechnology, biological systematics, and forensic medicine. Experiments in high-throughput DNA sequencing are notorious for generating DNA sequences in huge quantities, which poses challenges in the computation, storage, and exchange of sequence data. Computing on the Cloud helps mitigate the first two challenges because it provides on-demand machines, which save cost, and gives the flexibility to balance load both computation- and storage-wise. The data exchange problem can be mitigated to an extent through data compression. This work proposes a context-aware framework that decides which compression algorithm can minimize the time-to-completion and utilize resources efficiently, based on experiments with different Cloud and algorithm combinations and configurations. The results obtained from this framework and experimental setup show that DNAX is better than the rest of the algorithms in any context, but if the file size is less than 50kb then one can go for CTW or Gencompress. The Gzip algorithm, which is used in the NCBI repository to store sequences, has the worst compression ratio and time.
Citations: 1
Empowering Fast Incremental Computation over Large Scale Dynamic Graphs
2015 IEEE International Parallel and Distributed Processing Symposium Workshop Pub Date : 2015-05-25 DOI: 10.1109/IPDPSW.2015.136
Charith Wickramaarachchi, C. Chelmis, V. Prasanna
Abstract: Unprecedented growth of online social networks, communication networks, and the Internet of Things has given birth to large-volume, fast-changing datasets. Data generated by such systems have an inherent graph structure. Updates at staggering frequencies (e.g., edges created by message exchanges in online social media) impose a fundamental requirement for real-time processing of unruly yet highly interconnected data. As a result, large-scale dynamic graph processing has become a new research frontier in computer science. In this paper, we present a new vertex-centric hierarchical bulk synchronous parallel model for distributed processing of dynamic graphs. Our model allows users to easily compose static graph algorithms, as in the widely used vertex-centric model. It also enables incremental processing of dynamic graphs by automatically executing user-composed static graph algorithms in an incremental manner. We map the widely used single-source shortest path and connected component algorithms to this model and empirically analyze the performance on real-world large-scale graphs. Experimental results show that our model improves the performance of both static and dynamic graph computation compared to the vertex-centric model by reducing the global synchronization overhead.
Citations: 7
Parallel Methods for Optimizing High Order Constellations on GPUs
2015 IEEE International Parallel and Distributed Processing Symposium Workshop Pub Date : 2015-05-25 DOI: 10.1109/IPDPSW.2015.48
Paolo Spallaccini, F. Kayhan, Stefano Chinnici, G. Montorsi
Abstract: The increasing demand for fast mobile data has driven transmission systems to use high-order signal constellations. Conventional modulation schemes such as QAM and APSK are sub-optimal; large gains may be obtained by properly optimizing the constellation signal set under given channel constraints. The constellation optimization problem is computationally intensive, and the known methods rapidly become infeasible as the constellation order increases. Very few attempts to optimize constellations in excess of 64 signals have been reported. In this paper, we apply a simulated annealing (SA) algorithm to maximize the Mutual Information (MI) and Pragmatic Mutual Information (PMI), given the channel constraints. We first propose a GPU-accelerated method for calculating the MI and PMI of a constellation. For AWGN channels the method grants one order of magnitude speedup over a CPU realization. We also propose a parallelization of the Gauss-Hermite quadrature to compute the Average Mutual Information (AMI) and the Pragmatic Average Mutual Information (PAMI) on GPUs. Considering the more complex problem of constellation optimization over phase noise channels, we obtain two orders of magnitude speedup over CPUs. In order to reach such performance, novel parallel algorithms have been devised. Using our method, constellations with thousands of signals can be optimized.
Citations: 0
An Automated High-Level Design Framework for Partially Reconfigurable FPGAs
2015 IEEE International Parallel and Distributed Processing Symposium Workshop Pub Date : 2015-05-25 DOI: 10.1109/IPDPSW.2015.99
Rohit Kumar, A. Gordon-Ross
Abstract: Modern field-programmable gate arrays (FPGAs) allow runtime partial reconfiguration (PR) of the FPGA, enabling PR benefits such as runtime adaptability and extensibility, and reducing the application's area requirement. However, PR application development requires non-traditional expertise and lengthy design-time effort. Since high-level synthesis (HLS) languages afford fast application development time, these languages are becoming increasingly popular for FPGA application development. However, widely used HLS languages, such as C variants, do not contain PR-specific constructs, so exploiting PR benefits using an HLS language is a challenging task. To alleviate this challenge, we present an automated high-level design framework, PaRAT (partial reconfiguration amenability test). PaRAT parses, analyzes, and partitions an application's HLS code to generate the application's PR architectures, which contain the application's runtime-modifiable modules and thus allow the application's runtime reconfiguration. Case study analysis demonstrates PaRAT's ability to quickly and automatically generate PR architectures from an application's HLS code.
Citations: 5
Improved Internode Communication for Tile QR Decomposition for Multicore Cluster Systems
2015 IEEE International Parallel and Distributed Processing Symposium Workshop Pub Date : 2015-05-25 DOI: 10.1109/IPDPSW.2015.145
Tomohiro Suzuki
Abstract: Tile algorithms for matrix decomposition can generate many fine-grained tasks. Therefore, their suitability for processing on multicore architectures has attracted much attention from the high-performance computing (HPC) community. Our implementation of tile QR decomposition for a cluster system has dynamic scheduling, OpenMP work-sharing, and other useful features. In this article, we discuss the problems in internode communication that were present in our previous implementation. The improved implementation has both strong and weak scalability.
Citations: 1
Partial Region and Bitstream Cost Models for Hardware Multitasking on Partially Reconfigurable FPGAs
2015 IEEE International Parallel and Distributed Processing Symposium Workshop Pub Date : 2015-05-25 DOI: 10.1109/IPDPSW.2015.148
Aurelio Morales-Villanueva, A. Gordon-Ross
Abstract: Partial reconfiguration (PR) on field-programmable gate arrays (FPGAs) enables multiple PR modules (PRMs) to time-multiplex partially reconfigurable regions (PRRs), which affords reduced reconfiguration time, area overhead, etc., as compared to non-PR systems. However, to effectively leverage PR, system designers must determine appropriate PRR sizes/organizations during the early stages of PR system design, since inappropriate PRRs, given PRM requirements, can negate PR benefits, potentially resulting in system performance worse than a functionally equivalent non-PR design. To aid in PR system design, we present two portable, high-level cost models, which are based on the synthesis report results generated by Xilinx tools. These cost models estimate PRR size/organization given the PRR's associated PRMs to maximize the PRRs' resource utilization, and estimate the PRM's associated partial bitstream sizes based on the PRR sizes/organizations. Experiments evaluate our cost models' accuracy for different PRMs and required resources. The models afford enhanced designer productivity because they preclude the lengthy PR design flow that is typically required to attain such analysis.
Citations: 5
HiCOMB Introduction and Committees
2015 IEEE International Parallel and Distributed Processing Symposium Workshop Pub Date : 2015-05-25 DOI: 10.1109/IPDPSW.2015.160
S. Rajasekaran, S. Aluru, David A. Bader
Citations: 0
EduPar Keynote
2015 IEEE International Parallel and Distributed Processing Symposium Workshop Pub Date : 2015-05-25 DOI: 10.1109/IPDPSW.2015.177
Geoffrey Fox
Abstract: We describe the Indiana University Data Science program, which has a Masters, Certificate, and PhD minor approved. We note the wide variety of students, from hard-core developers of new systems to analysts using data-intensive decision systems. We describe experience teaching two courses aimed at software (http://bigdataopensourceprojects.soic.indiana.edu/) and applications/algorithms (https://bigdatacoursespring2015.appspot.com/preview), respectively. All these courses deliver lectures online and support both non-residential students completely online and residential sections operating in "flipped classroom" mode. We describe experience with two broadly available technologies, Google Course Builder and Microsoft Office Mix. These are both interesting, incomplete platforms. We describe the use of social media (forums) and support of online computing laboratory sessions. We note various mistakes we made and discuss the way forward.
Citations: 0
Relocation-Aware Floorplanning for Partially-Reconfigurable FPGA-Based Systems
2015 IEEE International Parallel and Distributed Processing Symposium Workshop Pub Date : 2015-05-25 DOI: 10.1109/IPDPSW.2015.52
Marco Rabozzi, Riccardo Cattaneo, Tobias Becker, W. Luk, M. Santambrogio
Abstract: In this paper we present a floorplanner for partially reconfigurable FPGAs that allows the designer to consider bitstream relocation constraints during the design of the system. The presented approach is an extension of our previous work on floorplanning based on a Mixed-Integer Linear Programming (MILP) formulation, thus allowing the designer to optimize a set of different metrics within a user-defined objective function while considering preferences related directly to relocation capabilities. Experimental results show that the presented approach is able to reserve multiple free areas for a reconfigurable region with a small impact on the solution cost in terms of wire length and size of the configuration data.
Citations: 3
Folding Methods for Event Timelines in Performance Analysis
2015 IEEE International Parallel and Distributed Processing Symposium Workshop Pub Date : 2015-05-25 DOI: 10.1109/IPDPSW.2015.47
Matthias Weber, Ronald Geisler, H. Brunst, W. Nagel
Abstract: The complexity of today's high performance computing systems, and their parallel software, requires performance analysis tools to fully understand application performance behavior. The visualization of event streams has proven to be a powerful approach for the detection of various types of performance problems. However, visualization of large numbers of process streams quickly hits the limits of available screen resolution. To alleviate this problem, we propose folding strategies for event timelines that consider common questions during performance analysis. We demonstrate the effectiveness of our solution with code inefficiencies in two real-world applications, PIConGPU and COSMO-SPECS. Our methods facilitate visual scalability and provide powerful overviews of performance data at the same time. Furthermore, our folding strategies improve GPU stream visualization and allow easy evaluation of the GPU device utilization.
Citations: 4