Wenting Yu, Yanming Shen, Keqiu Li, Junfeng Xu, Yong Li
2009 Second Pacific-Asia Conference on Web Mining and Web-based Application. Published 2009-06-06. DOI: 10.1109/WMWA.2009.82 (https://doi.org/10.1109/WMWA.2009.82)
A Top-k Query Calculation Algorithm in Distributed Networks with Probabilistic Guarantees
Top-k queries based on ranking elements stop query processing as soon as the top-k ranked results can be safely determined. There are two main approaches to top-k queries: accurate (exact) top-k queries and approximate top-k queries. However, existing top-k query algorithms consume a large amount of bandwidth. Because users typically want to identify only one or a few relevant items through a top-k query, approximate top-k query algorithms are an attractive way to reduce bandwidth usage. In this paper, we propose a three-phase approximate algorithm (TPAA), which is based on determining the difference between the values of the same object at different nodes. TPAA prunes in advance ("precuts") objects whose values differ greatly across nodes. By precutting illegitimate objects with high probability, TPAA can reduce bandwidth consumption while retaining high precision in some cases. It also supports probabilistic pruning of candidates, considerably reducing bandwidth usage at the expense of a small loss in precision of the top-k results. Furthermore, through performance evaluations using both theoretical analysis and computer simulations, we show that the proposed algorithm reduces bandwidth usage compared with existing probabilistic algorithms.
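
The abstract does not spell out TPAA's three phases, so the sketch below does not attempt to reproduce them. It only illustrates the trade-off the abstract refers to: bandwidth (counted here as the number of (object, score) pairs shipped to a coordinator) against precision (the fraction of the exact top-k that the approximate answer recovers). The approximation used is a deliberately naive one in which each node sends only its best few entries. The sum aggregation, the names `top_k`, `precision`, and `per_node_limit`, and the sample data are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

Node = Dict[str, float]  # object id -> local score held by one node


def top_k(nodes: List[Node], k: int,
          per_node_limit: Optional[int] = None) -> Tuple[List[str], int]:
    """Sum local scores at a coordinator and return (top-k ids, pairs sent).

    per_node_limit=None: every node ships all of its pairs (exact query).
    per_node_limit=m:    every node ships only its m best pairs. This is a
                         naive approximation for illustration, NOT TPAA.
    """
    totals: Dict[str, float] = {}
    sent = 0
    for node in nodes:
        # Order this node's objects by descending score, ties broken by id.
        best_first = sorted(node, key=lambda obj: (-node[obj], obj))
        for obj in best_first[:per_node_limit]:
            totals[obj] = totals.get(obj, 0.0) + node[obj]
            sent += 1  # one (object, score) pair transmitted
    ranked = sorted(totals, key=lambda obj: (-totals[obj], obj))
    return ranked[:k], sent


def precision(approx: List[str], exact: List[str]) -> float:
    """Fraction of the exact top-k that the approximate answer contains."""
    return len(set(approx) & set(exact)) / len(exact)


# Hypothetical data: three nodes, four objects, k = 2.
nodes = [
    {"a": 9, "b": 8, "c": 1, "d": 2},
    {"a": 8, "b": 7, "c": 2, "d": 1},
    {"a": 9, "c": 9, "b": 6, "d": 1},
]

exact_ids, exact_sent = top_k(nodes, k=2)
approx_ids, approx_sent = top_k(nodes, k=2, per_node_limit=1)
print(exact_ids, exact_sent)
print(approx_ids, approx_sent, precision(approx_ids, exact_ids))
```

With this sample data, limiting each node to two pairs halves the traffic without changing the answer, while limiting each node to one pair cuts traffic further but recovers only half of the exact top-2. A probabilistic scheme such as the one the abstract describes aims to choose what to prune more carefully than this fixed cutoff does.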