A Data Reusing Strategy Based on Column-Stores

Mei Wang, Jiaoling Zhou, Yue Li, Xiaoling Xia, Jiajin Le
Published in: 2013 IEEE 11th International Conference on Dependable, Autonomic and Secure Computing
Publication date: 2013-12-21
DOI: 10.1109/DASC.2013.56 (https://doi.org/10.1109/DASC.2013.56)
Citations: 0

Abstract

Data reuse is an important way to save storage capacity and improve query efficiency when managing massive data. A column-store architecture stores the values of each column contiguously, which greatly improves the performance of read-optimized applications and also makes data reuse more feasible and flexible. In this paper, we propose a novel reuse method for column-store data warehouses. First, we propose an improved iMAP method, based on schema mapping, that generates as many candidate reusable columns as possible and then filters these candidates further, which greatly reduces the complexity of detecting reusable data. Building on the column-store architecture, we then describe how reuse is implemented at the storage layer. Finally, we give a method for query execution based on reusable data. Experiments on real data sets indicate that the proposed strategy effectively reduces both storage space and query execution time.
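The abstract gives no implementation details, so the following is only a minimal sketch of the general idea: shortlist candidate columns cheaply, verify them, and then let several logical columns share one physical column at the storage layer. It does not implement iMAP or the authors' filter. The `profile` heuristic, the `ColumnStore` class, and all table and column names are hypothetical and chosen purely for illustration; columns are assumed to hold values of a single comparable type.

```python
from collections import defaultdict


def profile(values):
    """Cheap summary used to shortlist candidates (an assumed heuristic, not iMAP)."""
    kind = type(values[0]).__name__ if values else None
    return (len(values), kind, min(values, default=None), max(values, default=None))


class ColumnStore:
    """Toy column store in which logical columns point at shared physical columns."""

    def __init__(self):
        self.physical = {}  # physical id -> list of values
        self.logical = {}   # (table, column) -> physical id

    def add_table(self, table, columns):
        # Every column initially gets its own physical copy.
        for name, values in columns.items():
            pid = len(self.physical)
            self.physical[pid] = list(values)
            self.logical[(table, name)] = pid

    def deduplicate(self):
        """Group by profile, verify by full comparison, then share storage."""
        buckets = defaultdict(list)
        for pid, values in self.physical.items():
            buckets[profile(values)].append(pid)

        remap = {}  # redundant physical id -> physical id that is kept
        for pids in buckets.values():
            kept = []
            for pid in pids:
                match = next(
                    (k for k in kept if self.physical[k] == self.physical[pid]),
                    None,
                )
                if match is None:
                    kept.append(pid)
                else:
                    remap[pid] = match

        # Repoint logical columns, then drop the redundant physical copies.
        for key in list(self.logical):
            self.logical[key] = remap.get(self.logical[key], self.logical[key])
        for pid in remap:
            del self.physical[pid]
        return len(remap)

    def scan(self, table, column):
        """Read a logical column through the mapping to its physical column."""
        return self.physical[self.logical[(table, column)]]


store = ColumnStore()
store.add_table("orders", {"cust_id": [1, 2, 3], "amount": [10, 20, 30]})
store.add_table("invoices", {"customer": [1, 2, 3], "total": [10, 20, 31]})
saved = store.deduplicate()
```

In this toy example `invoices.customer` ends up sharing the physical column of `orders.cust_id`, while `total` stays separate because it differs from `amount` in one value. Only exact duplicates are handled here; columns related by a transformation, which a schema-mapping approach could also propose as candidates, are outside the scope of the sketch.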