CCIndex for Cassandra: A Novel Scheme for Multi-dimensional Range Queries in Cassandra

Chen Feng, Yongqiang Zou, Zhiwei Xu
Published in: 2011 Seventh International Conference on Semantics, Knowledge and Grids
Publication date: 2011-10-24
DOI: 10.1109/SKG.2011.28 (https://doi.org/10.1109/SKG.2011.28)
Citations: 20

Abstract

Multi-dimensional range queries are a fundamental requirement of large-scale Internet applications that use Distributed Ordered Tables (DOTs). Apache Cassandra is a DOT when it employs order-preserving hashing as its data partitioner. Cassandra supports multi-dimensional range queries, but with poor performance and with the limitation that one dimension must use an equality operator. Building on the success of the CCIndex scheme in Apache HBase, this paper tries to answer the question: can CCIndex benefit multi-dimensional range queries in DOTs like Cassandra? The paper studies the feasibility of employing CCIndex in Cassandra, proposes a new approach to estimating result size, implements CCIndex in Cassandra (including recovery mechanisms), and studies the pros and cons of CCIndex for different DOTs. Experimental results show that CCIndex achieves 2.4 to 3.7 times the efficiency of Cassandra's index scheme at 1% to 50% selectivity on 2 million records. The paper shows that CCIndex is a general approach for DOTs and can perform better on DOTs whose scans are much faster than random reads. It also reveals that, for reads and range queries, Cassandra is optimized for hash tables rather than ordered tables.
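
The abstract does not spell out how CCIndex works, so the following is only an illustrative sketch of the general idea it alludes to: keep one copy of the rows clustered (sorted) by each indexed column, estimate the result size of each range predicate, scan the most selective copy, and filter on the remaining predicates. This turns a multi-dimensional range query into a single contiguous scan instead of many random reads. All names here (`ClusteredIndexTable`, `range_query`, the sample columns) are hypothetical, the tables are in-memory lists rather than Cassandra column families, and the exact-count estimator is a stand-in, not the estimation approach the paper proposes.

```python
from bisect import bisect_left, bisect_right


class ClusteredIndexTable:
    """One full copy of the rows, sorted by a single indexed column (hypothetical)."""

    def __init__(self, rows, column):
        self.column = column
        # Store whole rows in index order so a range is one contiguous scan.
        self.rows = sorted(rows, key=lambda r: r[column])
        self.keys = [r[column] for r in self.rows]

    def estimate(self, lo, hi):
        """Number of rows with lo <= value <= hi (stand-in estimator)."""
        return bisect_right(self.keys, hi) - bisect_left(self.keys, lo)

    def scan(self, lo, hi):
        """Contiguous slice of rows whose indexed value lies in [lo, hi]."""
        return self.rows[bisect_left(self.keys, lo):bisect_right(self.keys, hi)]


def range_query(tables, predicates):
    """Answer a conjunction of inclusive range predicates {column: (lo, hi)}."""
    # Pick the index table whose predicate is expected to match the fewest rows.
    best = min(predicates, key=lambda c: tables[c].estimate(*predicates[c]))
    lo, hi = predicates[best]
    # Scan that table once, then filter on every predicate in memory.
    return [
        row
        for row in tables[best].scan(lo, hi)
        if all(p_lo <= row[c] <= p_hi for c, (p_lo, p_hi) in predicates.items())
    ]


# Made-up sample data: 1000 rows with two indexed columns.
rows = [{"id": i, "age": 20 + i % 50, "score": i % 100} for i in range(1000)]
tables = {c: ClusteredIndexTable(rows, c) for c in ("age", "score")}
result = range_query(tables, {"age": (30, 35), "score": (10, 12)})
```

Note that neither predicate in the sample query uses an equality operator, which is the case the abstract says Cassandra's built-in index scheme cannot serve. The cost of this design is storage, since every indexed column holds a full copy of the data.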