Multi-query Optimization for Distributed Similarity Query Processing

2008 The 28th International Conference on Distributed Computing Systems; Pub Date: 2008-06-17; DOI: 10.1109/ICDCS.2008.58

Zhuang Yi, Qing Li, Lei Chen

Citations: 11

Abstract

This paper considers multi-query optimization for distributed similarity query processing, which attempts to exploit the dependencies in the derivation of a query evaluation plan. To the best of our knowledge, this is the first work investigating a multi-query optimization technique for distributed similarity query processing (MDSQ). Four steps are incorporated in our MDSQ algorithm. First, a number of query requests (i.e., m query vectors and m radii) are submitted simultaneously by users. Then, a cost-based dynamic query scheduling (DQS) procedure is invoked to quickly and effectively identify the correlation among the query spheres (requests). After that, an index-based vector set reduction is performed at the data node level in parallel. Finally, a refinement process is conducted on the candidate vectors to obtain the answer set. The proposed method includes a cost-based dynamic query scheduling procedure, a Start-Distance (SD)-based load balancing scheme, and an index-based vector set reduction algorithm. The experimental results validate the efficiency and effectiveness of the algorithm in minimizing response time and increasing the parallelism of I/O and CPU.
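
The abstract gives no algorithmic detail, so the following is only a single-machine sketch of the general idea of batching range queries: group query spheres that overlap, filter candidates once per group, then refine per query. It is not the paper's DQS procedure, SD-based load balancing, or index. The grouping rule (two spheres overlap when the distance between their centers is at most the sum of their radii), the single-pivot triangle-inequality filter standing in for an index, and all names (`group_overlapping`, `multi_range_query`, `pivot`) are assumptions made for illustration.

```python
import math
from itertools import combinations


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def group_overlapping(queries):
    """Group query spheres (center, radius) whose balls intersect.

    Assumption: overlap is used here as a simple proxy for the
    'correlation among query spheres' mentioned in the abstract.
    """
    parent = list(range(len(queries)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(queries)), 2):
        (ci, ri), (cj, rj) = queries[i], queries[j]
        if dist(ci, cj) <= ri + rj:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(queries)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


def multi_range_query(data, queries, pivot):
    """Answer several range queries with one filter pass per group."""
    # Stand-in for an index: distance of every vector to a pivot.
    pivot_dist = [dist(v, pivot) for v in data]
    results = {i: [] for i in range(len(queries))}

    for group in group_overlapping(queries):
        # Filter: by the triangle inequality, any answer v of query (c, r)
        # satisfies d(c, pivot) - r <= d(v, pivot) <= d(c, pivot) + r.
        lo = min(dist(queries[i][0], pivot) - queries[i][1] for i in group)
        hi = max(dist(queries[i][0], pivot) + queries[i][1] for i in group)
        candidates = [k for k, d in enumerate(pivot_dist) if lo <= d <= hi]

        # Refinement: exact distance check for each query in the group.
        for k in candidates:
            for i in group:
                center, radius = queries[i]
                if dist(data[k], center) <= radius:
                    results[i].append(k)
    return results


data = [(0, 0), (1, 0), (5, 5), (6, 5), (10, 10)]
queries = [((0.5, 0), 1.0), ((1.5, 0), 1.0), ((5.5, 5), 0.6)]
groups = group_overlapping(queries)
answers = multi_range_query(data, queries, pivot=(0, 0))
```

In this toy run the first two queries overlap and share one candidate scan, while the third is filtered separately. In a distributed setting the filter step would run in parallel on each data node, which this sketch does not model.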
