Title: Top-k Skyline Result Optimization Algorithm in MapReduce
Author: Aili Liu
Venue: 2019 14th International Conference on Computer Science & Education (ICCSE)
Publication date: 2019-08-01
DOI: 10.1109/ICCSE.2019.8845361 (https://doi.org/10.1109/ICCSE.2019.8845361)
Citation count: 0
Abstract
Skyline queries are widely used in multi-objective decision-making, data visualization, and other fields. With the rapid growth of data volume, skyline computation over big data has attracted increasing attention. However, it has a shortcoming: as the number of dimensions increases, the skyline result set becomes very large, so it is desirable to select only k points from it. In this paper, we propose a top-k skyline method for big data in MapReduce, called MR-DMKS. First, we convert each multidimensional data point to a single value, which is used to determine the dominance relationship between two data points. Second, we calculate a score for each point from the converted values. Third, we sort the data points by score using a window queue, which makes the sorting more efficient and accurate. Finally, we choose the k data objects with the strongest dominating capacity. Extensive experiments show that our method is effective and offers good flexibility and scalability on both real and synthetic data sets.
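For readers unfamiliar with the terms used in the abstract, the following is a minimal single-machine sketch of dominance, the skyline, and a top-k selection by dominating capacity. It is not the MR-DMKS algorithm: the abstract does not specify the single-value conversion, the scoring formula, or the window queue, so this sketch assumes the standard pairwise dominance test (smaller values are better in every dimension) and scores each skyline point by the number of points it dominates. All function names here are hypothetical.

```python
from heapq import nlargest
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a dominates b, assuming smaller is better.

    a dominates b when a is no worse than b in every dimension
    and strictly better in at least one.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def skyline(points: List[Point]) -> List[Point]:
    """Return the points that no other point dominates (naive O(n^2) scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def dominance_score(p: Point, points: List[Point]) -> int:
    """Assumed score: how many points in the data set p dominates."""
    return sum(1 for q in points if dominates(p, q))


def top_k_skyline(points: List[Point], k: int) -> List[Point]:
    """Pick the k skyline points with the highest assumed dominance score."""
    sky = skyline(points)
    return nlargest(k, sky, key=lambda p: dominance_score(p, points))


if __name__ == "__main__":
    # Illustrative two-dimensional data, invented for this example.
    data = [(1, 9), (3, 3), (9, 1), (4, 4), (5, 5), (2, 10), (10, 2)]
    print(skyline(data))
    print(top_k_skyline(data, 1))
```

In this toy data set the skyline holds three points, and the point (3, 3) dominates the most other points, so it is the one returned for k = 1. A MapReduce version would need to distribute both the skyline computation and the scoring across partitions, which this sketch does not attempt.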