Huu Vu Lam Cao, T. Phan, Q. Minh, Thanh Luan Hong, M. S. Q. Truong
Processing All k-Nearest Neighbor Query on Large Multidimensional Data
2016 International Conference on Advanced Computing and Applications (ACOMP), published 2016-11-01. DOI: 10.1109/ACOMP.2016.012 (https://doi.org/10.1109/ACOMP.2016.012)
Citation count: 1
Abstract
All k-nearest-neighbor (AkNN) query processing is a data processing problem that is important in many fields, such as computer architecture, searching user information by coordinates, and city planning. Data volumes keep growing and have become very large, which is a major challenge: many traditional methods are no longer effective on this problem. An existing method that processes the AkNN problem in a distributed and parallel manner on the MapReduce model, using an equal-cell-dividing technique, is effective on large multidimensional datasets. However, when the data is not evenly distributed, that method becomes inefficient and may not be feasible to run at all. In this paper, we improve this method by applying a new cell-dividing technique. Instead of dividing the target space into cells of the same size, we divide it into cells in which the numbers of points are balanced, so that no cell contains a large number of points. We also conduct experiments and compare the results of the old method with those of our method. Experimental results show that our method is more efficient and more stable.
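The contrast between equal-size cells and point-balanced cells can be illustrated with a small sketch. The abstract does not specify how the balanced cells are built, so the code below is an assumption, not the authors' algorithm: it substitutes a simple recursive median split (in the style of a k-d tree) and runs on a single machine rather than on MapReduce. The function names `equal_grid_counts` and `balanced_cells`, the `capacity` parameter, and the synthetic skewed dataset are all hypothetical and chosen for illustration only.

```python
import random
from collections import Counter


def equal_grid_counts(points, cells_per_axis, lo=0.0, hi=1.0):
    """Count points per cell when the space is cut into equal-size cells."""
    width = (hi - lo) / cells_per_axis
    counts = Counter()
    for p in points:
        # Clamp so that a coordinate equal to `hi` falls in the last cell.
        key = tuple(min(int((x - lo) / width), cells_per_axis - 1) for x in p)
        counts[key] += 1
    return counts


def balanced_cells(points, capacity, depth=0):
    """Split at the median, cycling through the axes, until every cell
    holds at most `capacity` points (illustrative stand-in technique)."""
    if len(points) <= capacity:
        return [points]
    axis = depth % len(points[0])
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return (balanced_cells(ordered[:mid], capacity, depth + 1)
            + balanced_cells(ordered[mid:], capacity, depth + 1))


random.seed(0)
# Synthetic skewed data: 900 points packed into one corner, 100 spread out.
skewed = ([(random.uniform(0, 0.1), random.uniform(0, 0.1)) for _ in range(900)]
          + [(random.random(), random.random()) for _ in range(100)])

grid = equal_grid_counts(skewed, cells_per_axis=4)
cells = balanced_cells(skewed, capacity=64)

print("largest equal-size cell:", max(grid.values()))
print("largest balanced cell:  ", max(len(c) for c in cells))
```

On skewed input like this, the equal-size grid puts nearly all points into a single cell, so whichever worker handles that cell does most of the work, while the balanced split caps every cell at `capacity` points. Finding the k nearest neighbors across cell boundaries is a separate step that this sketch does not cover.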