The Case for SIMDified Analytical Query Processing on GPUs

Proceedings of the 17th International Workshop on Data Management on New Hardware. Pub Date: 2021-06-20. DOI: 10.1145/3465998.3466015

Johannes Fett, A. Ungethüm, Dirk Habich, Wolfgang Lehner



Abstract

Data-level parallelism (DLP) is a heavily used, hardware-driven parallelization technique for optimizing analytical query processing, especially in in-memory column stores. This kind of parallelism is characterized by executing essentially the same operation on different data elements simultaneously. Besides the Single Instruction Multiple Data (SIMD) extensions of common x86 processors, GPUs also provide DLP, but with a different execution model called Single Instruction Multiple Threads (SIMT), in which multiple scalar threads are executed in a SIMD manner. Unfortunately, a complete GPU-specific implementation of all query operators currently has to be set up, because the existing vectorized implementations cannot be ported from x86 processors to GPUs. To avoid this implementation effort, this paper presents our vision of virtualizing GPUs as virtual vector engines with software-defined SIMD instructions, and of specializing hardware-oblivious vectorized operators to GPUs using our Template Vector Library (TVL).
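The idea of a hardware-oblivious vectorized operator can be illustrated with a minimal C++ sketch. It is not the TVL API: the names `scalar_ext`, `virtual_vector_ext`, and `sum_column`, as well as the primitive set (`set_zero`, `load`, `add`, `hadd`), are hypothetical and chosen only for illustration. The operator is written once against a template parameter, and each backend supplies the primitives. The software-defined backend here runs its lanes in a plain CPU loop; on a GPU one might map each lane to a thread, which this sketch does not attempt.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical backend 1: a one-lane "vector", i.e. ordinary scalar processing.
struct scalar_ext {
    using base_t = std::uint64_t;
    using vector_t = std::uint64_t;
    static constexpr std::size_t lanes = 1;

    static vector_t set_zero() { return 0; }
    static vector_t load(const base_t* p) { return *p; }
    static vector_t add(vector_t a, vector_t b) { return a + b; }
    static base_t hadd(vector_t v) { return v; }  // horizontal add of one lane
};

// Hypothetical backend 2: a software-defined vector register with N lanes.
// Assumption: on a GPU the per-lane loops could be replaced by one thread per
// lane; here they are emulated sequentially so the sketch runs anywhere.
template <std::size_t N>
struct virtual_vector_ext {
    using base_t = std::uint64_t;
    using vector_t = std::array<base_t, N>;
    static constexpr std::size_t lanes = N;

    static vector_t set_zero() { return vector_t{}; }
    static vector_t load(const base_t* p) {
        vector_t v{};
        for (std::size_t i = 0; i < N; ++i) v[i] = p[i];
        return v;
    }
    static vector_t add(const vector_t& a, const vector_t& b) {
        vector_t r{};
        for (std::size_t i = 0; i < N; ++i) r[i] = a[i] + b[i];
        return r;
    }
    static base_t hadd(const vector_t& v) {
        base_t s = 0;
        for (std::size_t i = 0; i < N; ++i) s += v[i];
        return s;
    }
};

// Hardware-oblivious operator: sums a column using only backend primitives.
template <typename Ext>
typename Ext::base_t sum_column(const typename Ext::base_t* data, std::size_t n) {
    assert(data != nullptr || n == 0);
    auto acc = Ext::set_zero();
    std::size_t i = 0;
    // Main loop: process one full vector of Ext::lanes elements per step.
    for (; i + Ext::lanes <= n; i += Ext::lanes) {
        acc = Ext::add(acc, Ext::load(data + i));
    }
    typename Ext::base_t result = Ext::hadd(acc);
    // Scalar remainder for elements that do not fill a whole vector.
    for (; i < n; ++i) result += data[i];
    return result;
}
```

Calling `sum_column<scalar_ext>(ptr, n)` and `sum_column<virtual_vector_ext<32>>(ptr, n)` on the same column yields the same sum; only the backend type changes, which is the property that would let an operator be specialized to a new target without rewriting it.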
