基于小区宽带引擎的矢量化数据处理

International Workshop on Data Management on New Hardware Pub Date : 2007-06-15 DOI:10.1145/1363189.1363195

S. Héman, N. Nes, M. Zukowski, P. Boncz

{"title":"基于小区宽带引擎的矢量化数据处理","authors":"S. Héman, N. Nes, M. Zukowski, P. Boncz","doi":"10.1145/1363189.1363195","DOIUrl":null,"url":null,"abstract":"In this work, we research the suitability of the Cell Broadband Engine for database processing. We start by outlining the main architectural features of Cell and use micro-benchmarks to characterize the latency and throughput of its memory infrastructure. Then, we discuss the challenges of porting RDBMS software to Cell: (i) all computations need to SIMD-ized, (ii) all performance-critical branches need to be eliminated, (iii) a very small and hard limit on program code size should be respected.\n While we argue that conventional database implementations, i.e. row-stores with Volcano-style tuple pipelining, are a hard fit to Cell, it turns out that the three challenges are quite easily met in databases that use column-wise processing. We managed to implement a proof-of-concept port of the vectorized query processing model of MonetDB/X100 on Cell by running the operator pipeline on the PowerPC, but having it execute the vectorized primitives (data parallel) on its SPE cores. A performance evaluation on TPC-H Q1 shows that vectorized query processing on Cell can beat conventional PowerPC and Itanium2 CPUs by a factor 20.","PeriodicalId":298901,"journal":{"name":"International Workshop on Data Management on New Hardware","volume":"31 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2007-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"51","resultStr":"{\"title\":\"Vectorized data processing on the cell broadband engine\",\"authors\":\"S. Héman, N. Nes, M. Zukowski, P. Boncz\",\"doi\":\"10.1145/1363189.1363195\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In this work, we research the suitability of the Cell Broadband Engine for database processing. We start by outlining the main architectural features of Cell and use micro-benchmarks to characterize the latency and throughput of its memory infrastructure. Then, we discuss the challenges of porting RDBMS software to Cell: (i) all computations need to SIMD-ized, (ii) all performance-critical branches need to be eliminated, (iii) a very small and hard limit on program code size should be respected.\\n While we argue that conventional database implementations, i.e. row-stores with Volcano-style tuple pipelining, are a hard fit to Cell, it turns out that the three challenges are quite easily met in databases that use column-wise processing. We managed to implement a proof-of-concept port of the vectorized query processing model of MonetDB/X100 on Cell by running the operator pipeline on the PowerPC, but having it execute the vectorized primitives (data parallel) on its SPE cores. A performance evaluation on TPC-H Q1 shows that vectorized query processing on Cell can beat conventional PowerPC and Itanium2 CPUs by a factor 20.\",\"PeriodicalId\":298901,\"journal\":{\"name\":\"International Workshop on Data Management on New Hardware\",\"volume\":\"31 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2007-06-15\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"51\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International Workshop on Data Management on New Hardware\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/1363189.1363195\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Workshop on Data Management on New Hardware","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/1363189.1363195","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 51

摘要

在这项工作中，我们研究了蜂窝宽带引擎对数据库处理的适用性。我们首先概述Cell的主要架构特性，并使用微基准测试来表征其内存基础设施的延迟和吞吐量。然后，我们讨论将RDBMS软件移植到Cell的挑战:(i)所有计算都需要simd化，(ii)所有性能关键分支都需要消除，(iii)应该尊重对程序代码大小的非常小的硬限制。虽然我们认为传统的数据库实现，例如使用volcano风格的元组管道的行存储，很难适合Cell，但事实证明，在使用列处理的数据库中，这三个挑战很容易满足。我们设法在Cell上实现了MonetDB/X100的向量化查询处理模型的概念验证端口，方法是在PowerPC上运行操作符管道，但让它在其SPE内核上执行向量化原语(数据并行)。对TPC-H Q1的性能评估表明，Cell上的向量化查询处理可以比传统的PowerPC和Itanium2 cpu高出20倍。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Vectorized data processing on the cell broadband engine

In this work, we research the suitability of the Cell Broadband Engine for database processing. We start by outlining the main architectural features of Cell and use micro-benchmarks to characterize the latency and throughput of its memory infrastructure. Then, we discuss the challenges of porting RDBMS software to Cell: (i) all computations need to SIMD-ized, (ii) all performance-critical branches need to be eliminated, (iii) a very small and hard limit on program code size should be respected. While we argue that conventional database implementations, i.e. row-stores with Volcano-style tuple pipelining, are a hard fit to Cell, it turns out that the three challenges are quite easily met in databases that use column-wise processing. We managed to implement a proof-of-concept port of the vectorized query processing model of MonetDB/X100 on Cell by running the operator pipeline on the PowerPC, but having it execute the vectorized primitives (data parallel) on its SPE cores. A performance evaluation on TPC-H Q1 shows that vectorized query processing on Cell can beat conventional PowerPC and Itanium2 CPUs by a factor 20.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

International Workshop on Data Management on New Hardware

自引率

0.00%

发文量