2014 IEEE International Parallel & Distributed Processing Symposium Workshops: Latest Papers, Page 10

A Distributed Speech Algorithm for Large Scale Data Communication Systems

2014 IEEE International Parallel & Distributed Processing Symposium Workshops Pub Date : 2014-05-19 DOI: 10.1109/IPDPSW.2014.187

N. Xiong, Guoxiang Tong, Wenzhong Guo, Jian Tan, Guanning Wu

Abstract (via Semantic Scholar, paper ID 114655106): Data-driven computing and the use of data for strategic advantage are exemplified by communication systems, in which speech intelligibility is commonly degraded by interfering noise. This interference comes from environmental noise, which reduces intelligibility by masking the signal of interest [1, 2]. An important task in communication systems is to extract speech from noisy speech while suppressing background noise. In this paper, subspace algorithm theory is introduced into a speech noise reduction system. We first analyze the principle of the LMS adaptive speech noise reduction algorithm together with the subspace algorithm; we then merge the subspace algorithm into the VS-LMS algorithm and propose a combined algorithm for an adaptive speech noise reduction system. Furthermore, we analyze the combined algorithm, which can decrease musical noise as well as generate a suitable step-size factor to resolve the contradiction. This issue cannot be resolved by the current LMS algorithm [31], which has a lower convergence speed and larger residual noise than our system. Our simulation results demonstrate that our algorithm performs 3 to 10 times better than the original algorithm at both low SNR (-5 to 0 dB) and high SNR (0 to +5 dB).
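
For orientation, the plain LMS noise canceller that the abstract above uses as its baseline can be sketched as follows. This is a minimal illustration and not the authors' combined subspace/VS-LMS algorithm: the fixed step size `mu`, the tap count, the two-tap noise path, and the sine wave standing in for speech are all assumptions chosen for the demo.

```python
import math
import random


def lms_noise_canceller(primary, reference, taps=4, mu=0.01):
    """Plain fixed-step LMS: subtract an adaptive estimate of the noise.

    primary   -- speech plus noise as picked up by the main sensor
    reference -- noise-only signal correlated with the noise in `primary`
    Returns the error signal, which serves as the cleaned speech estimate.
    """
    weights = [0.0] * taps
    cleaned = []
    for n in range(len(primary)):
        # Most recent `taps` reference samples, zero-padded at the start.
        x = [reference[n - k] if n - k >= 0 else 0.0 for k in range(taps)]
        noise_estimate = sum(w * xi for w, xi in zip(weights, x))
        error = primary[n] - noise_estimate
        # Gradient-descent weight update driven by the error signal.
        weights = [w + mu * error * xi for w, xi in zip(weights, x)]
        cleaned.append(error)
    return cleaned


def power(seq):
    return sum(v * v for v in seq) / len(seq)


def demo(num_samples=4000, seed=0):
    """Return (noise power before, residual noise power after) on the tail."""
    rng = random.Random(seed)
    speech = [math.sin(2 * math.pi * 0.01 * n) for n in range(num_samples)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(num_samples)]
    # Hypothetical two-tap acoustic path from the noise source to the sensor.
    leaked = [0.8 * noise[n] - 0.3 * (noise[n - 1] if n else 0.0)
              for n in range(num_samples)]
    primary = [s + v for s, v in zip(speech, leaked)]
    cleaned = lms_noise_canceller(primary, noise)
    tail = slice(num_samples // 2, None)  # skip the convergence phase
    before = power([p - s for p, s in zip(primary[tail], speech[tail])])
    after = power([c - s for c, s in zip(cleaned[tail], speech[tail])])
    return before, after
```

The fixed `mu` is where the trade-off arises: a larger value converges faster but leaves more residual noise, which is the kind of tension a variable step-size (VS-LMS) scheme is meant to ease.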

Citations: 0

Utility Driven Dynamic Resource Management in an Oversubscribed Energy-Constrained Heterogeneous System

2014 IEEE International Parallel & Distributed Processing Symposium Workshops Pub Date : 2014-05-19 DOI: 10.1109/IPDPSW.2014.12

Bhavesh Khemka, Ryan D. Friese, S. Pasricha, A. A. Maciejewski, H. Siegel, G. Koenig, Sarah Powers, Marcia Hilton, Jendra Rambharos, Steve Poole

Abstract (via Semantic Scholar, paper ID 114763175): In this paper, we address the problem of scheduling dynamically-arriving tasks to machines in an oversubscribed heterogeneous computing environment. Each task has a monotonically decreasing utility function associated with it that represents the utility (or value) based on the task's completion time. Our system model is designed based on the environments of interest to the Extreme Scale Systems Center at Oak Ridge National Laboratory. The goal of our scheduler is to maximize the total utility earned from task completions while satisfying an energy constraint. We design an energy-aware heuristic and compare its performance to heuristics from the literature. We also design an energy filtering technique for this environment that is used in conjunction with the heuristics. The filtering technique adapts to the energy remaining in the system and estimates a fair share of energy that a task's execution can consume. The filtering technique improves the performance of all the heuristics and distributes the consumption of energy throughout the day. Based on our analysis, we recommend the level of filtering to maximize the performance of scheduling techniques in an oversubscribed environment.
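
One way to picture the energy filter described in the abstract above is sketched below. The formula for the fair share (remaining energy divided by the expected number of remaining tasks, scaled by a filter level) and every name here are assumptions made for illustration; the paper's actual estimator and heuristics may differ.

```python
def pick_machine(options, energy_left, tasks_left, filter_level=1.0):
    """Choose a machine for one task under a hypothetical energy filter.

    options      -- list of (machine, utility, energy) tuples for this task
    energy_left  -- energy still available in the system
    tasks_left   -- expected number of tasks still to arrive (assumed known)
    filter_level -- multiplier that loosens (>1) or tightens (<1) the filter
    Returns the machine name, or None if every option is filtered out.
    """
    fair_share = filter_level * energy_left / max(tasks_left, 1)
    # Drop assignments that would consume more than the fair share.
    feasible = [opt for opt in options if opt[2] <= fair_share]
    if not feasible:
        return None
    # Among the survivors, take the assignment with the highest utility.
    return max(feasible, key=lambda opt: opt[1])[0]
```

With options `[("fast", 10.0, 8.0), ("slow", 6.0, 3.0)]`, 100 units of energy and 20 expected tasks, the fair share is 5 units, so only the slower, cheaper machine passes the filter; doubling `filter_level` admits the faster one.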

Citations: 16

A Parallel Large Neighborhood Search-Based Heuristic for the Disjunctively Constrained Knapsack Problem

2014 IEEE International Parallel & Distributed Processing Symposium Workshops Pub Date : 2014-05-19 DOI: 10.1109/IPDPSW.2014.173

M. Hifi, S. Nègre, T. Saadi, Sagvan Saleh, Lei Wu

Citations: 11

Twill: A Hybrid Microcontroller-FPGA Framework for Parallelizing Single-Threaded C Programs

2014 IEEE International Parallel & Distributed Processing Symposium Workshops Pub Date : 2014-05-19 DOI: 10.1109/IPDPSW.2014.17

Doug Gallatin, Aaron W. Keen, C. Lupo, J. Oliver

Abstract (via Semantic Scholar, paper ID 124383998): Increasingly, System-on-a-Chip platforms that incorporate both microprocessors and re-programmable logic are being used across several fields, ranging from the automotive industry to network infrastructure. Unfortunately, the development tools accompanying these products leave much to be desired, requiring knowledge of both traditional embedded systems languages like C and hardware description languages like Verilog. We propose to bridge this gap with Twill, a truly automatic hybrid compiler that can take advantage of the parallelism inherent in these platforms. Twill can extract long-running threads from single-threaded C code and distribute these threads across the hardware and software domains to more fully utilize the asymmetric characteristics between processors and the embedded reconfigurable logic fabric. We show that Twill provides a significant performance increase on the CHStone benchmarks, with an average 1.63 times increase over the pure hardware approach and an average 22.2 times increase over the pure software approach, while in general decreasing the area required by the reconfigurable logic compared to the pure hardware approach.

Citations: 3

Virtualization Support for FPGA-Based Coprocessors Connected via PCI Express to an Intel Multicore Platform

2014 IEEE International Parallel & Distributed Processing Symposium Workshops Pub Date : 2014-05-19 DOI: 10.1109/IPDPSW.2014.42

Viet Vu Duy, T. Sandmann, S. Bähr, O. Sander, J. Becker

Citations: 7

CoAdELL: Adaptivity and Compression for Improving Sparse Matrix-Vector Multiplication on GPUs

2014 IEEE International Parallel & Distributed Processing Symposium Workshops Pub Date : 2014-05-19 DOI: 10.1109/IPDPSW.2014.106

Marco Maggioni, T. Berger-Wolf

Abstract (via Semantic Scholar, paper ID 123240060): Numerous applications in science and engineering rely on sparse linear algebra. The efficiency of a fundamental kernel such as the Sparse Matrix-Vector multiplication (SpMV) is crucial for solving increasingly complex computational problems. However, the SpMV is notorious for its extremely low arithmetic intensity and irregular memory patterns, posing a challenge for optimization. Over the last few years, an extensive amount of literature has been devoted to implementing SpMV on Graphic Processing Units (GPUs), with the aim of exploiting the available fine-grain parallelism and memory bandwidth. In this paper, we propose to efficiently combine adaptivity and compression into an ELL-based sparse format in order to improve the state of the art of the SpMV on GPUs. The foundation of our work is AdELL, an efficient sparse data structure based on the idea of distributing working threads to rows according to their computational load, creating balanced hardware-level blocks (warps) while coping with the irregular matrix structure. We designed a lightweight index compression scheme based on delta encoding and warp granularity that can be transparently embedded into AdELL, leading to an immediate performance benefit associated with the bandwidth-limited nature of the SpMV. The proposed integration provides a highly-optimized novel sparse matrix format known as Compressed Adaptive ELL (CoAdELL). We evaluated the effectiveness of our approach on a large set of benchmarks from heterogeneous application domains. The results show consistent improvements for double-precision SpMV calculations over the AdELL baseline. Moreover, we assessed the general relevance of CoAdELL with respect to other optimized GPU-based sparse matrix formats. We drew a direct comparison with clSpMV and BRO-HYB, obtaining sufficient experimental evidence (33% geometric average improvement over clSpMV and 43% over BRO-HYB) to propose our research work as the new state of the art.
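
The idea of delta-encoding column indices, mentioned in the abstract above, can be illustrated on the CPU with a small sketch. It is not the CoAdELL format itself: the per-row layout, the function names, and the absence of warp-level grouping or bit packing are simplifying assumptions. The point is only that storing gaps between consecutive column indices yields small numbers that could be packed into fewer bits, which reduces memory traffic for a bandwidth-bound kernel.

```python
def delta_encode_row(cols):
    """Replace increasing column indices with gaps from the previous index.

    The first entry is the gap from column 0, so it equals the first index.
    """
    deltas, prev = [], 0
    for col in cols:
        deltas.append(col - prev)
        prev = col
    return deltas


def spmv_delta(rows, x):
    """Compute y = A @ x where each row is stored as (deltas, values).

    Column indices are rebuilt on the fly with a running sum, which is the
    decoding step a compressed kernel would perform while streaming a row.
    """
    y = []
    for deltas, values in rows:
        col, acc = 0, 0.0
        for delta, value in zip(deltas, values):
            col += delta
            acc += value * x[col]
        y.append(acc)
    return y
```

For example, a row with nonzeros in columns 5, 7 and 12 is stored as the gaps 5, 2 and 5.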

Citations: 7

Online Monitoring System for Performance Fault Detection

2014 IEEE International Parallel & Distributed Processing Symposium Workshops Pub Date : 2014-05-19 DOI: 10.1109/IPDPSW.2014.165

R. Gioiosa, Gokcen Kestor, D. Kerbyson

Abstract (via Semantic Scholar, paper ID 115715923): To achieve exaFLOPS performance within a contained power budget, next generation supercomputers will feature hundreds of millions of components operating at low- and near-threshold voltage. As the probability that at least one of these components fails during the execution of an application approaches certainty, it seems unrealistic to expect that any run of a scientific application will not experience some performance faults. We believe that there is a need for a new generation of light-weight performance and debugging tools that can be used online even during production runs of parallel applications and that can identify performance anomalies during the application execution. In this work we propose the design and implementation of a monitoring system that continuously inspects the evolution of running applications and the health of the system. To achieve minimum runtime overhead while maintaining the desired level of flexibility, we propose a decoupled approach in which accurate monitoring is performed at kernel level while performance anomaly disambiguation and corrective actions are performed at user level. We evaluate our monitoring system on a 32-core AMD Interlagos compute node. First, we show that the runtime overhead of the monitoring system is negligible (0-2%). Then we show how our system can be used to precisely identify performance faults in two different scenarios. In the first, we inject OS noise, while in the second we simulate the execution of a data analytics application next to a scientific simulation.

Citations: 3

GPS: Towards Simplified Communication on SGL Model

2014 IEEE International Parallel & Distributed Processing Symposium Workshops Pub Date : 2014-05-19 DOI: 10.1109/IPDPSW.2014.84

Chong Li, Gaétan Hains

Abstract (via Semantic Scholar, paper ID 122636977): Parallel programming and data-parallel algorithms have been the main techniques supporting high-performance computing for many decades. A major conceptual step was taken by L. Valiant, who introduced the Bulk-Synchronous Parallel (BSP) model. Parallel algorithms on BSP can be designed and measured by taking into account not only the classical balance between time and parallel space but also communication and synchronization. Inspired by BSP, the SGL bridging model was proposed in order to improve the simplicity of parallel program development, the portability of parallel program code, and the precision of parallel algorithm performance prediction on both classical parallel machines and novel hierarchical machines. The programming model of SGL replaces the BSML (BSP-OCaml) programming primitives with scatter, gather and pardo. However, SGL does not express "horizontal" communication patterns. In this paper we introduce the GPS theorem, which can be implemented later in a compiler to optimize SGL's "horizontal" all-to-all communication. We then propose a simplified version of BSML's put based on GPS and implement Tiskin-McColl parallel sample-sort with it. The comparison of BSML's put and SGL's GPS shows that GPS has better code readability and lower execution time.
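
The scatter, pardo and gather primitives named in the abstract above follow a familiar master-worker shape, sketched below. This is only a sequential stand-in with hypothetical function signatures; it is not the SGL implementation and does not model its cost prediction, its hierarchical machines, or the GPS transformation.

```python
def scatter(data, workers):
    """Master step: split `data` into `workers` nearly equal contiguous chunks."""
    base, extra = divmod(len(data), workers)
    chunks, start = [], 0
    for i in range(workers):
        size = base + (1 if i < extra else 0)
        chunks.append(data[start:start + size])
        start += size
    return chunks


def pardo(chunks, fn):
    """Worker step: apply `fn` to every chunk (run sequentially here)."""
    return [fn(chunk) for chunk in chunks]


def gather(results, combine):
    """Master step: merge the per-worker results into one value."""
    return combine(results)
```

A sum over ten numbers with three workers is then `gather(pardo(scatter(list(range(10)), 3), sum), sum)`. Data only moves vertically between master and workers in this shape, which is why direct worker-to-worker ("horizontal") exchange needs separate treatment.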

Citations: 1

Minimum Set Cover of Sparsely Distributed Sensor Nodes by a Collection of Unit Disks

2014 IEEE International Parallel & Distributed Processing Symposium Workshops Pub Date : 2014-05-19 DOI: 10.1109/IPDPSW.2014.87

S. Fujita

Citations: 1

Cloud-Based Simulation of a Smart Power Grid

2014 IEEE International Parallel & Distributed Processing Symposium Workshops Pub Date : 2014-05-19 DOI: 10.1109/IPDPSW.2014.100

Ashkan Paya, D. Marinescu

Citations: 3