Near Data Filtering for Distributed Database Systems

Zimeng Zhou, Xuan Sun, Jinghuan Yu, Sarana Nutanong, C. Xue
Published in: 2018 Ninth International Green and Sustainable Computing Conference (IGSC), 2018-10-01
DOI: 10.1109/IGCC.2018.8752112 (https://doi.org/10.1109/IGCC.2018.8752112)

Abstract

Over the past decade, data movement costs have come to dominate the execution time of data-intensive applications on distributed systems, and they are expected to become even more important in the future. Near data processing is a straightforward way to reduce data movement: it brings compute resources closer to the data source. This paper explores near data processing in a generic distributed system to improve performance by reducing data movement. An efficient near data filtering solution is designed and implemented by introducing a filter layer that performs tuple-level near data filtering. To reduce the idle time of processing nodes and improve data transmission throughput, the proposed solution is extended to support block-level near data filtering by creating an index for each data block. Furthermore, to answer the question of when and how to perform near data filtering, this paper proposes an adaptive near data filtering solution that balances computation and data transmission throughput. Experimental results show that the proposed solutions outperform the best existing method in most cases. The adaptive near data filtering solution achieves an average speedup factor of 4.59 for queries with low selectivity.
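
The difference between tuple-level and block-level near data filtering can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say what kind of per-block index is built, so a min/max key range per block is assumed here, and the names `Block`, `build_blocks`, `tuple_level_filter`, `block_level_filter`, and `refine_on_compute_node` are hypothetical. The adaptive choice between the two modes is not modeled.

```python
from dataclasses import dataclass
from typing import List, Tuple

Row = Tuple[int, str]  # (key, payload); hypothetical schema for illustration


@dataclass
class Block:
    rows: List[Row]
    min_key: int  # assumed per-block index: smallest key in the block
    max_key: int  # assumed per-block index: largest key in the block


def build_blocks(rows: List[Row], block_size: int) -> List[Block]:
    """Split rows into fixed-size blocks and record a min/max index for each."""
    blocks = []
    for start in range(0, len(rows), block_size):
        chunk = rows[start:start + block_size]
        keys = [key for key, _ in chunk]
        blocks.append(Block(chunk, min(keys), max(keys)))
    return blocks


def tuple_level_filter(blocks: List[Block], lo: int, hi: int) -> List[Row]:
    """Storage side: test every tuple and ship only the matching ones."""
    return [row for block in blocks for row in block.rows if lo <= row[0] <= hi]


def block_level_filter(blocks: List[Block], lo: int, hi: int) -> List[Block]:
    """Storage side: consult only the index and ship whole candidate blocks."""
    return [b for b in blocks if b.max_key >= lo and b.min_key <= hi]


def refine_on_compute_node(blocks: List[Block], lo: int, hi: int) -> List[Row]:
    """Compute side: finish the predicate on the blocks that were shipped."""
    return [row for block in blocks for row in block.rows if lo <= row[0] <= hi]
```

With 100 rows keyed 0 to 99 in blocks of 10 and the predicate 25 <= key <= 34, the tuple-level path ships 10 rows, while the block-level path ships 2 of the 10 blocks (20 rows) without scanning any tuple on the storage side. The compute node then refines those blocks to the same 10 rows. Under these assumptions, block-level filtering trades some extra transfer for less storage-side computation.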