2014 IEEE International Parallel & Distributed Processing Symposium Workshops: Latest Papers

A Distributed Speech Algorithm for Large Scale Data Communication Systems
2014 IEEE International Parallel & Distributed Processing Symposium Workshops Pub Date : 2014-05-19 DOI: 10.1109/IPDPSW.2014.187
N. Xiong, Guoxiang Tong, Wenzhong Guo, Jian Tan, Guanning Wu
Abstract: Data-driven computing and the use of data for strategic advantage are exemplified by communication systems, where speech intelligibility is generally degraded by interfering noise. This interference comes from environmental noise, which reduces intelligibility by masking the signal of interest [1, 2]. An important task in communication systems is to extract speech from noisy speech while suppressing background noise. In this paper, subspace algorithm theory is introduced into a speech noise reduction system. We first analyze the principle of the LMS adaptive speech noise reduction algorithm together with the subspace algorithm, then merge the subspace algorithm into the VS-LMS algorithm and propose a combined algorithm for an adaptive speech noise reduction system. Furthermore, we analyze the combined algorithm, which can decrease musical noise and generate a suitable step-size factor to resolve the contradiction. This issue cannot be resolved by the current LMS algorithm [31], which has a slower convergence speed and larger residual noise than our system. Our simulation results demonstrate that our algorithm performs 3 to 10 times better than the original algorithm at both low SNR (-5 to 0 dB) and high SNR (0 to +5 dB).
Citations: 0
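As a rough illustration of the baseline that the abstract above builds on, here is a minimal sketch of a fixed-step LMS adaptive noise canceller in Python. It is not the paper's combined subspace/VS-LMS method. The function names, the tap count, the step size, and the synthetic two-tap noise path in `demo` are all assumptions chosen for illustration.

```python
import math
import random


def lms_noise_canceller(reference, primary, num_taps=8, step_size=0.01):
    """Fixed-step LMS: estimate the noise in `primary` from `reference`.

    The error signal (primary minus estimated noise) is the cleaned output.
    """
    weights = [0.0] * num_taps
    window = [0.0] * num_taps  # most recent reference samples, newest first
    cleaned = []
    for x, d in zip(reference, primary):
        window = [x] + window[:-1]
        noise_estimate = sum(w * v for w, v in zip(weights, window))
        error = d - noise_estimate
        # Standard LMS weight update: w <- w + 2 * mu * e * x
        weights = [w + 2.0 * step_size * error * v
                   for w, v in zip(weights, window)]
        cleaned.append(error)
    return cleaned


def demo(num_samples=4000, seed=0):
    """Return (noise power before, residual noise power after) on a toy signal."""
    rng = random.Random(seed)
    noise = [rng.uniform(-1.0, 1.0) for _ in range(num_samples)]
    speech = [0.5 * math.sin(2.0 * math.pi * 0.01 * k)
              for k in range(num_samples)]
    # Hypothetical acoustic path: the noise reaches the primary input
    # through a short two-tap filter.
    primary = [s + 0.8 * noise[k] + (0.3 * noise[k - 1] if k > 0 else 0.0)
               for k, s in enumerate(speech)]
    cleaned = lms_noise_canceller(noise, primary)
    tail = range(num_samples - 1000, num_samples)  # after convergence
    before = sum((primary[k] - speech[k]) ** 2 for k in tail) / 1000
    after = sum((cleaned[k] - speech[k]) ** 2 for k in tail) / 1000
    return before, after
```

Calling `demo()` returns the noise power in the primary signal before and after filtering, measured over the last 1000 samples. A variable-step-size variant would adjust `step_size` per sample instead of holding it constant, trading off convergence speed against residual noise.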
Utility Driven Dynamic Resource Management in an Oversubscribed Energy-Constrained Heterogeneous System
2014 IEEE International Parallel & Distributed Processing Symposium Workshops Pub Date : 2014-05-19 DOI: 10.1109/IPDPSW.2014.12
Bhavesh Khemka, Ryan D. Friese, S. Pasricha, A. A. Maciejewski, H. Siegel, G. Koenig, Sarah Powers, Marcia Hilton, Jendra Rambharos, Steve Poole
Abstract: In this paper, we address the problem of scheduling dynamically arriving tasks to machines in an oversubscribed heterogeneous computing environment. Each task has a monotonically decreasing utility function associated with it that represents the utility (or value) based on the task's completion time. Our system model is designed based on the environments of interest to the Extreme Scale Systems Center at Oak Ridge National Laboratory. The goal of our scheduler is to maximize the total utility earned from task completions while satisfying an energy constraint. We design an energy-aware heuristic and compare its performance to heuristics from the literature. We also design an energy filtering technique for this environment that is used in conjunction with the heuristics. The filtering technique adapts to the energy remaining in the system and estimates a fair share of energy that a task's execution can consume. The filtering technique improves the performance of all the heuristics and distributes the consumption of energy throughout the day. Based on our analysis, we recommend the level of filtering to maximize the performance of scheduling techniques in an oversubscribed environment.
Citations: 16
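The energy filtering idea in the abstract above can be pictured with a small Python sketch. The abstract does not give the paper's formulas, so everything here is a hypothetical stand-in: the fair share is assumed to be the remaining energy divided evenly over the tasks still expected, scaled by an assumed `leniency` factor, and the utility function is supplied by the caller.

```python
def energy_fair_share(energy_remaining, tasks_remaining, leniency=1.0):
    """Assumed fair-share rule: split the remaining energy evenly over the
    tasks still expected, scaled by a tunable leniency factor."""
    return leniency * energy_remaining / max(tasks_remaining, 1)


def pick_machine(options, budget, utility):
    """Choose the machine with the highest utility among options whose
    energy use fits the budget.

    options: list of (machine_name, completion_time, energy) tuples.
    utility: monotonically decreasing function of completion time.
    Returns None when every option is filtered out.
    """
    feasible = [opt for opt in options if opt[2] <= budget]
    if not feasible:
        return None
    return max(feasible, key=lambda opt: utility(opt[1]))[0]
```

With a tight budget the filter rules out the fast but energy-hungry machine and the task goes to a slower one. With a generous budget the fastest machine wins because it earns the most utility.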
A Parallel Large Neighborhood Search-Based Heuristic for the Disjunctively Constrained Knapsack Problem
2014 IEEE International Parallel & Distributed Processing Symposium Workshops Pub Date : 2014-05-19 DOI: 10.1109/IPDPSW.2014.173
M. Hifi, S. Nègre, T. Saadi, Sagvan Saleh, Lei Wu
Abstract: This paper proposes a parallel large neighborhood search-based heuristic for solving the Disjunctively Constrained Knapsack Problem (DCKP), which has an important impact on transportation problems. The proposed approach is designed using the Message Passing Interface (MPI). The effectiveness of MPI allows us to build a flexible message-passing model of parallel programming. Meanwhile, a large neighborhood search heuristic is introduced in the model in order to propose an efficient resolution method yielding high-quality solutions. The results provided by the proposed method are compared to those reached by the Cplex solver and to those obtained by one of the best methods in the literature. As the experimental results show, the proposed model is able to provide high-quality solutions with fast runtime on most benchmark instances from the literature.
Citations: 11
Twill: A Hybrid Microcontroller-FPGA Framework for Parallelizing Single-Threaded C Programs
2014 IEEE International Parallel & Distributed Processing Symposium Workshops Pub Date : 2014-05-19 DOI: 10.1109/IPDPSW.2014.17
Doug Gallatin, Aaron W. Keen, C. Lupo, J. Oliver
Abstract: Increasingly, System-on-a-Chip platforms that incorporate both microprocessors and re-programmable logic are being utilized across several fields, ranging from the automotive industry to network infrastructure. Unfortunately, the development tools accompanying these products leave much to be desired, requiring knowledge of both traditional embedded systems languages like C and hardware description languages like Verilog. We propose to bridge this gap with Twill, a truly automatic hybrid compiler that can take advantage of the parallelism inherent in these platforms. Twill can extract long-running threads from single-threaded C code and distribute these threads across the hardware and software domains to more fully utilize the asymmetric characteristics between processors and the embedded reconfigurable logic fabric. We show that Twill provides a significant performance increase on the CHStone benchmarks, with an average 1.63 times increase over the pure hardware approach and an average 22.2 times increase over the pure software approach, while in general decreasing the area required by the reconfigurable logic compared to the pure hardware approach.
Citations: 3
Virtualization Support for FPGA-Based Coprocessors Connected via PCI Express to an Intel Multicore Platform
2014 IEEE International Parallel & Distributed Processing Symposium Workshops Pub Date : 2014-05-19 DOI: 10.1109/IPDPSW.2014.42
Viet Vu Duy, T. Sandmann, S. Bähr, O. Sander, J. Becker
Abstract: In automotive electronics, the approach of integrating several existing single-core electronic control units into a multicore computer platform is now emerging. The integration may result in mixed-criticality systems where robust segregation between software applications is crucial. Another requirement for this process is the reusability of legacy software. Virtualization is a promising technique that can help to solve these problems. In this paper, we present hardware/software co-designed virtualization support for FPGA-based coprocessors that are connected via PCI Express to an Intel multicore platform. Experimental results show that our approach outperforms completely software-based virtualization approaches by up to 3.13 times for read operations and up to 26.26 times for write operations.
Citations: 7
CoAdELL: Adaptivity and Compression for Improving Sparse Matrix-Vector Multiplication on GPUs
2014 IEEE International Parallel & Distributed Processing Symposium Workshops Pub Date : 2014-05-19 DOI: 10.1109/IPDPSW.2014.106
Marco Maggioni, T. Berger-Wolf
Abstract: Numerous applications in science and engineering rely on sparse linear algebra. The efficiency of a fundamental kernel such as Sparse Matrix-Vector multiplication (SpMV) is crucial for solving increasingly complex computational problems. However, SpMV is notorious for its extremely low arithmetic intensity and irregular memory patterns, posing a challenge for optimization. Over the last few years, an extensive amount of literature has been devoted to implementing SpMV on Graphics Processing Units (GPUs), with the aim of exploiting the available fine-grain parallelism and memory bandwidth. In this paper, we propose to efficiently combine adaptivity and compression into an ELL-based sparse format in order to improve the state of the art of SpMV on GPUs. The foundation of our work is AdELL, an efficient sparse data structure based on the idea of distributing working threads to rows according to their computational load, creating balanced hardware-level blocks (warps) while coping with the irregular matrix structure. We designed a lightweight index compression scheme based on delta encoding and warp granularity that can be transparently embedded into AdELL, leading to an immediate performance benefit associated with the bandwidth-limited nature of SpMV. The proposed integration provides a highly optimized novel sparse matrix format known as Compressed Adaptive ELL (CoAdELL). We evaluated the effectiveness of our approach on a large set of benchmarks from heterogeneous application domains. The results show consistent improvements for double-precision SpMV calculations over the AdELL baseline. Moreover, we assessed the general relevance of CoAdELL with respect to other optimized GPU-based sparse matrix formats. We drew a direct comparison with clSpMV and BRO-HYB, obtaining sufficient experimental evidence (33% geometric average improvement over clSpMV and 43% over BRO-HYB) to propose our research work as the novel state of the art.
Citations: 7
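To make the index compression idea in the abstract above concrete, here is a minimal Python sketch of delta encoding applied to the sorted column indices of one sparse row. It only illustrates the general technique. The per-row granularity and the helper names are assumptions, whereas the abstract describes a scheme that works at warp granularity inside AdELL.

```python
from itertools import accumulate


def delta_encode(column_indices):
    """Keep the first sorted column index, then store only the gaps."""
    if not column_indices:
        return []
    gaps = [b - a for a, b in zip(column_indices, column_indices[1:])]
    return [column_indices[0]] + gaps


def delta_decode(deltas):
    """Recover the column indices with a running (prefix) sum."""
    return list(accumulate(deltas))


def bits_needed(values):
    """Smallest bit width able to hold every non-negative value in `values`."""
    return max((v.bit_length() for v in values), default=0)
```

For a row whose nonzeros sit in columns 100000, 100003, 100004 and 100010, the raw indices need 17 bits each, while the gaps 3, 1 and 6 fit in 3 bits. Because SpMV is bandwidth-limited, reading narrower indices from memory is where the benefit would come from.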
Online Monitoring System for Performance Fault Detection
2014 IEEE International Parallel & Distributed Processing Symposium Workshops Pub Date : 2014-05-19 DOI: 10.1109/IPDPSW.2014.165
R. Gioiosa, Gokcen Kestor, D. Kerbyson
Abstract: To achieve exaFLOPS performance within a contained power budget, next-generation supercomputers will feature hundreds of millions of components operating at low and near-threshold voltage. As the probability that at least one of these components fails during the execution of an application approaches certainty, it seems unrealistic to expect that any run of a scientific application will not experience some performance faults. We believe that there is a need for a new generation of lightweight performance and debugging tools that can be used online, even during production runs of parallel applications, and that can identify performance anomalies during application execution. In this work we propose the design and implementation of a monitoring system that continuously inspects the evolution of running applications and the health of the system. To achieve minimum runtime overhead while maintaining the desired level of flexibility, we propose a decoupled approach in which accurate monitoring is performed at kernel level while performance anomaly disambiguation and corrective actions are performed at user level. We evaluate our monitoring system on a 32-core AMD Interlagos compute node. First, we show that the runtime overhead of the monitoring system is negligible (0-2%). Then we show how our system can be used to precisely identify performance faults in two different scenarios. In the first, we inject OS noise, while in the second we simulate the execution of a data analytics application next to a scientific simulation.
Citations: 3
GPS: Towards Simplified Communication on SGL Model
2014 IEEE International Parallel & Distributed Processing Symposium Workshops Pub Date : 2014-05-19 DOI: 10.1109/IPDPSW.2014.84
Chong Li, Gaétan Hains
Abstract: Parallel programming and data-parallel algorithms have been the main techniques supporting high-performance computing for many decades. A major conceptual step was taken by L. Valiant, who introduced the Bulk-Synchronous Parallel (BSP) model. Parallel algorithms on BSP can be designed and measured by taking into account not only the classical balance between time and parallel space but also communication and synchronization. Inspired by BSP, the SGL bridging model was proposed in order to improve the simplicity of parallel program development, the portability of parallel program code, and the precision of parallel algorithm performance prediction on both classical parallel machines and novel hierarchical machines. The programming model of SGL replaces the BSML (BSP-OCaml) programming primitives with scatter, gather and pardo. However, SGL does not express "horizontal" communication patterns. In this paper we introduce the GPS theorem, which can be implemented later in a compiler to optimize SGL's "horizontal" all-to-all communication. We then propose a simplified version of BSML's put based on GPS and implement Tiskin-McColl parallel sample-sort with it. The comparison of BSML's put and SGL's GPS shows that GPS offers better code readability and lower execution time.
Citations: 1
Minimum Set Cover of Sparsely Distributed Sensor Nodes by a Collection of Unit Disks
2014 IEEE International Parallel & Distributed Processing Symposium Workshops Pub Date : 2014-05-19 DOI: 10.1109/IPDPSW.2014.87
S. Fujita
Abstract: In this paper, we consider the problem of covering vertices distributed over a two-dimensional plane with as few unit disks as possible. This restricted version of the set cover problem is motivated by the problem of reducing the network cost of Wireless Sensor Networks, which have been widely used in recent years. The main contribution of the current paper is the proposal of an exact algorithm for solving the minimum set cover by unit disks, which outputs an optimum solution in O(n2(10/ε)√nlog2 n) time, where ε is the minimum distance between distinct vertices in the plane. This result indicates that if sensor nodes are "sparsely" distributed over the region so that the distance to the closest sensor is long enough compared with the transmission radius of the base station (i.e., when ε = ω(1/√n)), we can significantly reduce the running time of the previous algorithm, which solves the ordinary set cover problem in an exact manner.
Citations: 1
Cloud-Based Simulation of a Smart Power Grid
2014 IEEE International Parallel & Distributed Processing Symposium Workshops Pub Date : 2014-05-19 DOI: 10.1109/IPDPSW.2014.100
Ashkan Paya, D. Marinescu
Abstract: Is it feasible to automatically generate a cloud environment for applications based on a dynamic computational model when the actual workflow changes over time? We discuss the answer to this question in the context of a complex application, the simulation of a smart grid. We argue that the IaaS cloud delivery model offers enough flexibility and that Amazon Web Services has evolved to the point where automatic generation of a computing environment is not only feasible, but also leads to an efficient computing infrastructure. In this paper we develop a model of a smart power grid and then investigate the means to reduce the time needed for the automatic generation of the simulation environment and to reduce the overall cost of the simulation.
Citations: 3