Can GPU-accelerator significantly increase the effectiveness of conservative DBMS considerable volumes on cluster platforms?

2017 International Siberian Conference on Control and Communications (SIBCON), Pub Date: 2017-06-01, DOI: 10.1109/SIBCON.2017.7998474

V. Raikhlin, R. K. Klassen


Citations: 1

Abstract


Can GPU-accelerator significantly increase the effectiveness of conservative DBMS considerable volumes on cluster platforms?

This paper discusses the issues of building a conservative DBMS (one whose data is updated only episodically, in specially allotted time windows) on GPU-cluster platforms for large databases, with a database volume Vdb of at least 100 GB. The topic is relevant because of the modern trend toward intelligent processing of large data arrays using graphics accelerators (GPUs). By assumption, query processing follows a regular plan. The cluster nodes run MySQL on multi-core processors (two processors per host) with all cores fully loaded. During query processing, each node's database resides in the node's main memory, which has a capacity of up to 128 GB. Problems arise from the relatively small volume of GPU global memory and from the data transfer speed of the PCI-e bus that connects the CPU and the GPU. Two cases are considered: a medium Vdb of about 100 GB, replicated across the nodes, and a large Vdb, from hundreds of gigabytes to several terabytes, hash-partitioned across the set of nodes. In the first case, two variants of DBMS operation are analyzed: 1) «select-project» operations on the CPU and «join» on the GPU; 2) «project» and «join» on the CPU and «select» on the GPU, with the database stored in compressed form. In both variants the use of accelerators was found to be uncompetitive. A possible solution is to develop a specialized DBMS whose query optimization is focused on the use of the GPU. In the second case, «select-project» operations are performed on the executive nodes, either with accelerators (compressed database) or without them (uncompressed database, Vdb of hundreds of gigabytes), and «join» operations are performed on the JOIN node without an additional GPU. Whether this configuration can be expected to improve overall database performance compared with the previously developed DBMS Clusterix remains an open question.
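The division of work described for the second case (a hash-partitioned database, «select-project» on the executive nodes, «join» on a separate JOIN node) can be illustrated with a minimal Python sketch. This is not the Clusterix implementation and not the authors' code: plain in-memory lists stand in for the per-node MySQL databases, neither the GPU nor compression is modelled, and every name (`hash_partition`, `select_project`, `hash_join`, `run_query`, the sample tables) is hypothetical.

```python
from collections import defaultdict

def hash_partition(rows, key, n_nodes):
    """Distribute rows over n_nodes by hashing the key column."""
    parts = [[] for _ in range(n_nodes)]
    for row in rows:
        parts[hash(row[key]) % n_nodes].append(row)
    return parts

def select_project(rows, predicate, columns):
    """Keep rows that satisfy the predicate, then keep only the listed columns."""
    return [{c: row[c] for c in columns} for row in rows if predicate(row)]

def hash_join(left, right, key):
    """Equi-join two row lists on key, using a hash index built on the left input."""
    index = defaultdict(list)
    for l_row in left:
        index[l_row[key]].append(l_row)
    return [{**l_row, **r_row} for r_row in right for l_row in index.get(r_row[key], [])]

# Hypothetical sample tables, used for illustration only.
orders = [
    {"order_id": 1, "cust_id": 10, "total": 50},
    {"order_id": 2, "cust_id": 11, "total": 500},
    {"order_id": 3, "cust_id": 10, "total": 700},
]
customers = [
    {"cust_id": 10, "name": "A"},
    {"cust_id": 11, "name": "B"},
]

def run_query(orders, customers, n_nodes):
    """Select-project runs per executive node; the join runs once on the JOIN node."""
    order_parts = hash_partition(orders, "cust_id", n_nodes)
    cust_parts = hash_partition(customers, "cust_id", n_nodes)
    selected_orders, selected_customers = [], []
    for node in range(n_nodes):  # each iteration stands in for one executive node
        selected_orders += select_project(
            order_parts[node], lambda r: r["total"] > 100, ["order_id", "cust_id"])
        selected_customers += select_project(
            cust_parts[node], lambda r: True, ["cust_id", "name"])
    # The intermediate results are gathered and joined on the JOIN node.
    return hash_join(selected_customers, selected_orders, "cust_id")

if __name__ == "__main__":
    for row in sorted(run_query(orders, customers, 3), key=lambda r: r["order_id"]):
        print(row)
```

In this sketch the join result does not depend on the number of executive nodes, because partitioning only changes where the «select-project» work is done.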

