Proceedings. Symposium on Computer Architecture and High Performance Computing: Latest Articles

A Parallel Algorithm for the Facility Location Problem Applied to Oil and Gas Logistics

Proceedings. Symposium on Computer Architecture and High Performance Computing Pub Date : 2015-10-18 DOI: 10.1109/SBAC-PADW.2015.9

T. Pinheiro, M. D. Castro

Citations: 0

Efficient irregular wavefront propagation algorithms on Intel® Xeon Phi™.

Proceedings. Symposium on Computer Architecture and High Performance Computing Pub Date : 2015-10-01 DOI: 10.1109/SBAC-PAD.2015.13

Jeremias M Gomes, George Teodoro, Alba de Melo, Jun Kong, Tahsin Kurc, Joel H Saltz

Abstract: We investigate the execution of the Irregular Wavefront Propagation Pattern (IWPP), a fundamental computing structure used in several image analysis operations, on the Intel® Xeon Phi™ co-processor. An efficient implementation of IWPP on the Xeon Phi is a challenging problem because of IWPP's irregularity and the use of atomic instructions in the original IWPP algorithm to resolve race conditions. On the Xeon Phi, the use of SIMD and vectorization instructions is critical to attain high performance. However, SIMD atomic instructions are not supported. Therefore, we propose a new IWPP algorithm that can take advantage of the supported SIMD instruction set. We also evaluate an alternate storage container (priority queue) to track active elements in the wavefront in an effort to improve the parallel algorithm efficiency. The new IWPP algorithm is evaluated with Morphological Reconstruction and Imfill operations as use cases. Our results show performance improvements of up to 5.63× on top of the original IWPP due to vectorization. Moreover, the new IWPP achieves speedups of 45.7× and 1.62×, respectively, as compared to efficient CPU and GPU implementations.

Pages: 25-32

Citations: 10

Fast LH

Proceedings. Symposium on Computer Architecture and High Performance Computing Pub Date : 2013-10-23 DOI: 10.1109/SBAC-PAD.2013.15

Juan Chabkinian, Thomas J. E. Schwarz

Citations: 1

Mapping Pipelined Applications with Replication to Increase Throughput and Reliability

Proceedings. Symposium on Computer Architecture and High Performance Computing Pub Date : 2010-10-27 DOI: 10.1109/SBAC-PAD.2010.16

A. Benoit, L. Marchal, Y. Robert, O. Sinnen

Abstract: Mapping and scheduling an application onto the processors of a parallel system is a difficult problem. This is true when performance is the only objective, but becomes worse when a second optimization criterion like reliability is involved. In this paper we investigate the problem of mapping an application consisting of several consecutive stages, i.e., a pipeline, onto heterogeneous processors, while considering both the performance, measured as throughput, and the reliability. The mechanism of replication, which refers to the mapping of an application stage onto more than one processor, can be used to increase throughput but also to increase reliability. Finding the right replication trade-off plays a pivotal role for this bi-criteria optimization problem. Our formal model includes heterogeneous processors, both in terms of execution speed as well as in terms of reliability. We study the complexity of the various subproblems and show how a solution can be obtained for the polynomial cases. For the general NP-hard problem, heuristics are presented and experimentally evaluated. We further propose the design of an exact algorithm based on A* state space search which allows us to evaluate the performance of our heuristics for small problem instances.

Pages: 55-62

Citations: 5

memu: Unifying Application Modeling and Cluster Exploitation

Proceedings. Symposium on Computer Architecture and High Performance Computing Pub Date : 2004-01-01 DOI: 10.1109/CAHPC.2004.23

A. Alves, A. Pina, J. Exposto, J. Rufino

Citations: 1

On the Combined Scheduling of Malleable and Rigid Jobs

Proceedings. Symposium on Computer Architecture and High Performance Computing Pub Date : 2004-01-01 DOI: 10.1109/CAHPC.2004.27

Jan Hungershöfer

Citations: 28