Proceedings of the 2020 Sixth Workshop on Programming Models for SIMD/Vector Processing: Latest Articles

Scheduling Irregular Dataflow Pipelines on SIMD Architectures

Proceedings of the 2020 Sixth Workshop on Programming Models for SIMD/Vector Processing Pub Date: 2020-02-22 DOI: 10.1145/3380479.3380480

Tom Plano, J. Buhler

Abstract: Streaming computations often exhibit substantial data parallelism that makes them well-suited to SIMD architectures. However, many such computations also exhibit irregularity, in the form of data-dependent, dynamic data rates, that makes efficient SIMD execution challenging. One aspect of this challenge is the need to schedule execution of a computation realized as a pipeline of stages connected by finite queues. A scheduler must both ensure high SIMD occupancy by gathering queued items into vectors and minimize costs associated with switching execution between stages. In this work, we present the AFIE (Active Full, Inactive Empty) scheduling policy for irregular streaming applications on SIMD processors. AFIE provably groups inputs to each stage of a pipeline into a minimal number of SIMD vectors while incurring a bounded number of switches relative to the best possible policy. These results apply even though irregularity forbids a priori knowledge of how many outputs will be generated from each input to each stage. We have implemented AFIE as an extension to the MERCATOR system [6] for building irregular streaming applications on NVIDIA GPUs. We describe how the AFIE scheduler simplifies MERCATOR's runtime code and empirically measure the new scheduler's improved performance on irregular streaming applications.

Citations: 5

SIMD-based Exact Parallel Fuzzy Dilation Operator for Fast Computing of Fuzzy Spatial Relations

Proceedings of the 2020 Sixth Workshop on Programming Models for SIMD/Vector Processing Pub Date: 2020-02-22 DOI: 10.1145/3380479.3380482

Régis Pierrard, Laurent Cabaret, Jean-Philippe Poli, C. Hudelot

Abstract: For decades, fuzzy spatial relations have demonstrated their utility and effectiveness for visual reasoning, including semantic annotation and object recognition. However, a major issue is that they often involve fuzzy morphological operators that are compute-intensive leading to long latency in the relation evaluation. As a result, approximate methods have been proposed to compute some relations in an acceptable time, but they are not as generic as the fuzzy dilation or do not make the most of modern computing architectures. In this paper, we introduce the Reverse and the Parallel Reverse (PR) algorithms. Reverse is an exact and efficient algorithm for the fuzzy dilation operator and PR combines the Reverse algorithm exactness with efficient usage of modern-processor multiple cores using OpenMP. Using SIMD extensions to enhance Parallel Reverse, PR128 (AVX), PR256 (AVX2), and PR512 (AVX512) are faster than the state-of-the-art approximate methods while remaining generic and exact. To demonstrate the performance of PR and highlight the contribution of the SIMD instructions, an extensive benchmark was carried out on two datasets of natural and artificial images.

Citations: 2

How to speed Connected Component Labeling up with SIMD RLE algorithms

Proceedings of the 2020 Sixth Workshop on Programming Models for SIMD/Vector Processing Pub Date: 2020-02-22 DOI: 10.1145/3380479.3380481

F. Lemaitre, A. Hennequin, L. Lacassagne

Citations: 10