Proceedings of the 2020 Sixth Workshop on Programming Models for SIMD/Vector Processing: Latest Articles

Scheduling Irregular Dataflow Pipelines on SIMD Architectures
Tom Plano, J. Buhler
Abstract: Streaming computations often exhibit substantial data parallelism that makes them well-suited to SIMD architectures. However, many such computations also exhibit irregularity, in the form of data-dependent, dynamic data rates, that makes efficient SIMD execution challenging. One aspect of this challenge is the need to schedule execution of a computation realized as a pipeline of stages connected by finite queues. A scheduler must both ensure high SIMD occupancy by gathering queued items into vectors and minimize costs associated with switching execution between stages. In this work, we present the AFIE (Active Full, Inactive Empty) scheduling policy for irregular streaming applications on SIMD processors. AFIE provably groups inputs to each stage of a pipeline into a minimal number of SIMD vectors while incurring a bounded number of switches relative to the best possible policy. These results apply even though irregularity forbids a priori knowledge of how many outputs will be generated from each input to each stage. We have implemented AFIE as an extension to the MERCATOR system [6] for building irregular streaming applications on NVIDIA GPUs. We describe how the AFIE scheduler simplifies MERCATOR's runtime code and empirically measure the new scheduler's improved performance on irregular streaming applications.
DOI: https://doi.org/10.1145/3380479.3380480
Published: 2020-02-22
Citations: 5
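The abstract above speaks of grouping the inputs to a stage into a minimal number of SIMD vectors. The sketch below only illustrates the arithmetic behind that goal; it is not the AFIE policy, and the function names and the vector width of 8 are assumptions made for illustration. It compares the number of vectors consumed when a stage runs on several partially filled batches with the lower bound reached when the same items are gathered first.

```python
from math import ceil


def vectors_used(batch_sizes, width):
    """Vectors consumed when each batch is processed on its own.

    Every batch is padded up to a whole number of vectors, so partially
    filled batches waste SIMD lanes. (Hypothetical helper, not from the paper.)
    """
    return sum(ceil(n / width) for n in batch_sizes)


def min_vectors(total_items, width):
    """Lower bound on vectors needed for total_items at the given width."""
    return ceil(total_items / width)


# Four runs of a stage with 5 queued items each, SIMD width 8 (assumed).
eager = vectors_used([5, 5, 5, 5], 8)
packed = min_vectors(20, 8)
```

Here running the stage four times on 5 items each consumes 4 vectors, while the same 20 items gathered into full vectors need only 3.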
SIMD-based Exact Parallel Fuzzy Dilation Operator for Fast Computing of Fuzzy Spatial Relations
Régis Pierrard, Laurent Cabaret, Jean-Philippe Poli, C. Hudelot
Abstract: For decades, fuzzy spatial relations have demonstrated their utility and effectiveness for visual reasoning, including semantic annotation and object recognition. However, a major issue is that they often involve fuzzy morphological operators that are compute-intensive leading to long latency in the relation evaluation. As a result, approximate methods have been proposed to compute some relations in an acceptable time, but they are not as generic as the fuzzy dilation or do not make the most of modern computing architectures. In this paper, we introduce the Reverse and the Parallel Reverse (PR) algorithms. Reverse is an exact and efficient algorithm for the fuzzy dilation operator and PR combines the Reverse algorithm exactness with efficient usage of modern-processor multiple cores using OpenMP. Using SIMD extensions to enhance Parallel Reverse, PR128 (AVX), PR256 (AVX2), and PR512 (AVX512) are faster than the state-of-the-art approximate methods while remaining generic and exact. To demonstrate the performance of PR and highlight the contribution of the SIMD instructions, an extensive benchmark was carried out on two datasets of natural and artificial images.
DOI: https://doi.org/10.1145/3380479.3380482
Published: 2020-02-22
Citations: 2
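For readers unfamiliar with the fuzzy dilation operator named in the abstract above, the following is a naive sketch of one common textbook definition, assuming the min t-norm: the dilated value at x is the maximum over y of min(mu(y), nu(x - y)). It is not the Reverse or Parallel Reverse algorithm from the paper; the function name, the list-of-lists image representation, and the centered structuring element are assumptions made for illustration.

```python
def fuzzy_dilation(mu, nu):
    """Naive fuzzy dilation of mu by structuring element nu (min t-norm).

    mu: 2D list of membership degrees in [0, 1].
    nu: 2D list with odd dimensions, its origin at the center cell.
    Pixels outside the image are treated as having membership 0.
    """
    h, w = len(mu), len(mu[0])
    kh, kw = len(nu), len(nu[0])
    cy, cx = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            best = 0.0
            for di in range(kh):
                for dj in range(kw):
                    # y = x - offset, with offset = (di - cy, dj - cx)
                    yi, yj = i - (di - cy), j - (dj - cx)
                    if 0 <= yi < h and 0 <= yj < w:
                        best = max(best, min(mu[yi][yj], nu[di][dj]))
            out[i][j] = best
    return out


# A single fully-member pixel dilated by a small cross-shaped element.
image = [[0, 0, 0],
         [0, 1, 0],
         [0, 0, 0]]
element = [[0, 0.5, 0],
           [0.5, 1, 0.5],
           [0, 0.5, 0]]
dilated = fuzzy_dilation(image, element)
```

Each output pixel visits every cell of the structuring element, so the cost grows with the product of image size and element size, which is the kind of compute intensity the abstract points to.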
How to speed Connected Component Labeling up with SIMD RLE algorithms
F. Lemaitre, A. Hennequin, L. Lacassagne
Abstract: The research in Connected Component Labeling, although old, is still very active and several efficient algorithms for CPUs and GPUs have emerged during the last years and are always improving the performance. This article introduces a new SIMD run-based algorithm for CCL. We show how RLE compression can be SIMDized and used to accelerate scalar run-based CCL algorithms. A benchmark done on Intel, AMD and ARM processors shows that this new algorithm outperforms the State-of-the-Art by an average factor of x1.7 on AVX2 machines and x1.9 on Intel Xeon Skylake with AVX512.
DOI: https://doi.org/10.1145/3380479.3380481
Published: 2020-02-22
Citations: 10
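The abstract above relies on run-length encoding (RLE) of image rows as the input to run-based labeling. As a minimal scalar sketch of that representation only (not the SIMD algorithm from the paper), the function below encodes one binary row as a list of foreground runs. The function name and the half-open (start, end) interval convention are assumptions made for illustration.

```python
def rle_row(row):
    """Encode a binary row as half-open runs (start, end) of foreground pixels."""
    runs = []
    start = None  # start index of the run currently open, if any
    for x, pixel in enumerate(row):
        if pixel and start is None:
            start = x                  # a run begins at this pixel
        elif not pixel and start is not None:
            runs.append((start, x))    # the run ended just before x
            start = None
    if start is not None:
        runs.append((start, len(row)))  # close a run that reaches the row end
    return runs


runs = rle_row([0, 1, 1, 0, 0, 1, 1, 1])
```

For this row the runs are (1, 3) and (5, 8). A run-based labeling algorithm can then work with these two intervals instead of eight individual pixels.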