Marcos R. Vieira, C. Traina, A. Traina, Adriano S. Arantes, C. Faloutsos
Boosting k-Nearest Neighbor Queries Estimating Suitable Query Radii
19th International Conference on Scientific and Statistical Database Management (SSDBM 2007), published 2007-07-09. DOI: 10.1109/SSDBM.2007.5 (https://doi.org/10.1109/SSDBM.2007.5)
This paper proposes novel and effective techniques for estimating a radius with which to answer k-nearest neighbor (k-NN) queries. The first technique targets datasets where the distribution of pairwise distances between elements can be learned, and it generates a global estimate that applies to the whole dataset. The second technique targets datasets where the first cannot be employed, and it generates estimates that depend on where the query center is located. The proposed k-NNF() algorithm combines both techniques, achieving remarkable speedups. Experiments on both real and synthetic datasets show that the proposed algorithm can accelerate k-NN queries by more than 26 times compared with the incremental algorithm, and that it takes half the total time of the traditional k-NN() algorithms.
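The general idea of answering a k-NN query through an estimated radius can be illustrated with a small sketch. It is not the paper's k-NNF() algorithm: the abstract does not give the estimation formulas, so the function names (`estimate_radius`, `range_query`, `knn_with_estimated_radius`), the sampling-based quantile estimate, and the radius-doubling fallback are all assumptions made for illustration. The sketch also scans the data linearly, whereas any real speedup would come from a metric index pruning the range query.

```python
import math
import random


def estimate_radius(data, k, sample_pairs=2000, seed=0):
    """Hypothetical global estimate: the radius r such that roughly a
    fraction k/(n-1) of sampled pairwise distances fall within r."""
    rng = random.Random(seed)
    n = len(data)
    # Sample pairs of distinct elements and sort their distances.
    dists = sorted(math.dist(*rng.sample(data, 2)) for _ in range(sample_pairs))
    frac = min(1.0, k / (n - 1))
    idx = min(len(dists) - 1, max(0, math.ceil(frac * len(dists)) - 1))
    return dists[idx]


def range_query(data, q, r):
    """Return (distance, point) for every point within r of q.
    A linear scan stands in for an index-backed range query."""
    return [(d, p) for p in data if (d := math.dist(p, q)) <= r]


def knn_with_estimated_radius(data, q, k, growth=2.0):
    """Answer a k-NN query with a range query at an estimated radius,
    enlarging the radius (an assumed fallback) if too few points are found."""
    r = estimate_radius(data, k)
    while True:
        hits = range_query(data, q, r)
        # Enough neighbors found, or the whole dataset is already covered.
        if len(hits) >= k or len(hits) == len(data):
            return sorted(hits)[:k]
        r = r * growth if r > 0 else 1e-9


if __name__ == "__main__":
    rng = random.Random(1)
    points = [(rng.random(), rng.random()) for _ in range(500)]
    for dist, point in knn_with_estimated_radius(points, (0.5, 0.5), 5):
        print(f"{dist:.4f}", point)
```

The result is exact whenever the range query returns at least k points, because every point closer than the k-th neighbor also lies within the radius. A poor estimate therefore costs extra work (too many candidates, or a repeated query) rather than correctness, which is why the quality of the radius estimate matters.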