您所需要的只是超字级并行性:使用SLP的系统控制流矢量化

Proceedings of the 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation Pub Date : 2022-06-09 DOI:10.1145/3519939.3523701

Yishen Chen, Charith Mendis, Saman P. Amarasinghe

{"title":"您所需要的只是超字级并行性:使用SLP的系统控制流矢量化","authors":"Yishen Chen, Charith Mendis, Saman P. Amarasinghe","doi":"10.1145/3519939.3523701","DOIUrl":null,"url":null,"abstract":"Superword-level parallelism (SLP) vectorization is a proven technique for vectorizing straight-line code. It works by replacing independent, isomorphic instructions with equivalent vector instructions. Larsen and Amarasinghe originally proposed using SLP vectorization (together with loop unrolling) as a simpler, more flexible alternative to traditional loop vectorization. However, this vision of replacing traditional loop vectorization has not been realized because SLP vectorization cannot directly reason with control flow. In this work, we introduce SuperVectorization, a new vectorization framework that generalizes SLP vectorization to uncover parallelism that spans different basic blocks and loop nests. With the capability to systematically vectorize instructions across control-flow regions such as basic blocks and loops, our framework simultaneously subsumes the roles of inner-loop, outer-loop, and straight-line vectorizer while retaining the flexibility of SLP vectorization (e.g., partial vectorization). Our evaluation shows that a single instance of our vectorizer is competitive with and, in many cases, significantly better than LLVM’s vectorization pipeline, which includes both loop and SLP vectorizers. For example, on an unoptimized, sequential volume renderer from Pharr and Mark, our vectorizer gains a 3.28× speedup, whereas none of the production compilers that we tested vectorizes to its complex control-flow constructs.","PeriodicalId":140942,"journal":{"name":"Proceedings of the 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation","volume":"28 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-06-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":"{\"title\":\"All you need is superword-level parallelism: systematic control-flow vectorization with SLP\",\"authors\":\"Yishen Chen, Charith Mendis, Saman P. Amarasinghe\",\"doi\":\"10.1145/3519939.3523701\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Superword-level parallelism (SLP) vectorization is a proven technique for vectorizing straight-line code. It works by replacing independent, isomorphic instructions with equivalent vector instructions. Larsen and Amarasinghe originally proposed using SLP vectorization (together with loop unrolling) as a simpler, more flexible alternative to traditional loop vectorization. However, this vision of replacing traditional loop vectorization has not been realized because SLP vectorization cannot directly reason with control flow. In this work, we introduce SuperVectorization, a new vectorization framework that generalizes SLP vectorization to uncover parallelism that spans different basic blocks and loop nests. With the capability to systematically vectorize instructions across control-flow regions such as basic blocks and loops, our framework simultaneously subsumes the roles of inner-loop, outer-loop, and straight-line vectorizer while retaining the flexibility of SLP vectorization (e.g., partial vectorization). Our evaluation shows that a single instance of our vectorizer is competitive with and, in many cases, significantly better than LLVM’s vectorization pipeline, which includes both loop and SLP vectorizers. For example, on an unoptimized, sequential volume renderer from Pharr and Mark, our vectorizer gains a 3.28× speedup, whereas none of the production compilers that we tested vectorizes to its complex control-flow constructs.\",\"PeriodicalId\":140942,\"journal\":{\"name\":\"Proceedings of the 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation\",\"volume\":\"28 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-06-09\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"5\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3519939.3523701\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3519939.3523701","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 5

摘要

超字级并行(SLP)向量化是一种经过验证的直线代码向量化技术。它的工作原理是用等价的向量指令替换独立的同构指令。Larsen和Amarasinghe最初提出使用SLP矢量化(连同循环展开)作为传统循环矢量化的更简单、更灵活的替代方案。然而，由于SLP矢量化不能直接推理控制流，这种取代传统循环矢量化的愿景尚未实现。在这项工作中，我们引入了SuperVectorization，这是一种新的矢量化框架，它推广了SLP矢量化，以揭示跨越不同基本块和循环巢的并行性。由于具有跨控制流区域(如基本块和循环)系统地向量化指令的能力，我们的框架同时包含了内环、外环和直线向量化器的角色，同时保留了SLP向量化的灵活性(例如，部分向量化)。我们的评估表明，我们的矢量化器的单个实例与LLVM的矢量化管道具有竞争力，并且在许多情况下明显优于LLVM的矢量化管道，其中包括循环和SLP矢量化器。例如，在Pharr和Mark的未优化的顺序体积渲染器上，我们的矢量化器获得了3.28倍的加速，而我们测试的任何生产编译器都没有矢量化到其复杂的控制流结构。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

All you need is superword-level parallelism: systematic control-flow vectorization with SLP

Superword-level parallelism (SLP) vectorization is a proven technique for vectorizing straight-line code. It works by replacing independent, isomorphic instructions with equivalent vector instructions. Larsen and Amarasinghe originally proposed using SLP vectorization (together with loop unrolling) as a simpler, more flexible alternative to traditional loop vectorization. However, this vision of replacing traditional loop vectorization has not been realized because SLP vectorization cannot directly reason with control flow. In this work, we introduce SuperVectorization, a new vectorization framework that generalizes SLP vectorization to uncover parallelism that spans different basic blocks and loop nests. With the capability to systematically vectorize instructions across control-flow regions such as basic blocks and loops, our framework simultaneously subsumes the roles of inner-loop, outer-loop, and straight-line vectorizer while retaining the flexibility of SLP vectorization (e.g., partial vectorization). Our evaluation shows that a single instance of our vectorizer is competitive with and, in many cases, significantly better than LLVM’s vectorization pipeline, which includes both loop and SLP vectorizers. For example, on an unoptimized, sequential volume renderer from Pharr and Mark, our vectorizer gains a 3.28× speedup, whereas none of the production compilers that we tested vectorizes to its complex control-flow constructs.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Proceedings of the 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation

自引率

0.00%

发文量