Learned Index on GPU

2022 IEEE 38th International Conference on Data Engineering Workshops (ICDEW), Pub Date: 2022-05-01, DOI: 10.1109/icdew55742.2022.00024

Xun Zhong, Yong Zhang, Yu Chen, Chao Li, Chunxiao Xing


Citations: 2

Abstract

Learned Index on GPU

An index is a key structure for quickly accessing specific information in a database. Recent research on “learned indexes” has received extensive attention. The key idea is that an index can be regarded as a model that maps keys to positions in a data set, so a traditional index structure can be replaced by a machine learning model. Current learned indexes generally achieve higher time efficiency and occupy less space than traditional indexes, but their query efficiency and concurrency are limited by the CPU. GPUs are widely used for compute-intensive tasks because of their unique architecture and powerful computing ability. Building on recent research on learned indexes, we propose a new approach that combines the advantages of GPUs and learned indexes: the learned index is placed in GPU memory to make full use of the GPU's high concurrency and computing power. We implement the PGM-index on GPU and conduct an extensive set of experiments on several real-life and synthetic datasets. The results demonstrate that our method outperforms the original CPU-based learned index by up to 20× on static workloads when the query scale is large.
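To make the "index as a model" idea concrete, the following is a minimal Python sketch, not the authors' implementation and not the actual PGM-index construction. It assumes sorted, distinct integer keys, fits one straight line per fixed-size run of keys (the real PGM-index chooses segment boundaries from an error bound instead), records the worst prediction error `eps`, and answers a query by predicting a position and searching only within `eps` of it. The names `build_segments`, `lookup`, and `batch_lookup` are hypothetical. The batch loop runs sequentially here; on a GPU, each query in the batch could plausibly be handled by its own thread, which is an assumption about how such a design might look rather than a description of the paper's kernels.

```python
import bisect


def build_segments(keys, seg_size=64):
    """Fit one line per fixed-size run of sorted, distinct keys.

    Returns (segments, eps): each segment is (first_key, slope, start_pos),
    and eps is the largest |predicted - true| position seen over all keys.
    """
    segments = []
    eps = 0
    for start in range(0, len(keys), seg_size):
        end = min(start + seg_size, len(keys))
        k0, k1 = keys[start], keys[end - 1]
        # Line through the first and last key of the run.
        slope = (end - 1 - start) / (k1 - k0) if k1 > k0 else 0.0
        segments.append((k0, slope, start))
        for pos in range(start, end):
            pred = int(start + slope * (keys[pos] - k0))
            eps = max(eps, abs(pred - pos))
    return segments, eps


def lookup(keys, segments, eps, key):
    """Return the position of key in keys, or -1 if it is absent."""
    first_keys = [seg[0] for seg in segments]
    s = bisect.bisect_right(first_keys, key) - 1
    if s < 0:
        return -1  # key is smaller than every indexed key
    k0, slope, start = segments[s]
    pred = int(start + slope * (key - k0))
    # Search only the window the model's error bound allows.
    lo = max(0, pred - eps)
    hi = min(len(keys), pred + eps + 1)
    i = bisect.bisect_left(keys, key, lo, hi)
    return i if i < hi and keys[i] == key else -1


def batch_lookup(keys, segments, eps, queries):
    """Answer a batch of independent queries (sequential stand-in for one thread per query)."""
    return [lookup(keys, segments, eps, q) for q in queries]
```

Because every query is independent and the search window is bounded by `eps`, the per-query work is small and uniform, which is the property that makes large query batches a natural fit for massively parallel hardware.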
