T. Phan, Markus Jäger, Stefan Nadschläger, J. Küng
2015 26th International Workshop on Database and Expert Systems Applications (DEXA), published 2015-09-01. DOI: 10.1109/DEXA.2015.41 (https://doi.org/10.1109/DEXA.2015.41)
Range-Based Clustering Supporting Similarity Search in Big Data
Thanks to state-of-the-art technologies, the agricultural domain is supported by increasingly modern infrastructure and automated processes. Data collected from parcels by these systems and by remote sensors for further analysis raise the three main challenges of the big data era: volume, variety, and velocity. For similarity search, we propose a range-based clustering method that finds the objects most similar to a given object in large-scale computation with MapReduce. The proposed method groups objects into clusters, which serve as pivots for a pre-check performed before similarity is computed. Furthermore, we conduct basic experiments to evaluate the performance of the proposed method and to observe the influence of the clusters on similarity search.
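The abstract does not specify how the clusters are built or how the pre-check is defined, so the following is only a minimal single-machine sketch of the general idea of using cluster pivots to skip work before computing similarity. It is not the authors' algorithm. It assumes objects are numeric tuples, similarity is measured by Euclidean distance, and the pre-check is the standard triangle-inequality bound. The names `Cluster`, `build_clusters`, `range_search`, and the parameter `max_radius` are hypothetical and introduced here for illustration.

```python
import math
from dataclasses import dataclass, field


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


@dataclass
class Cluster:
    """A pivot object plus the members that fall within range of it."""
    pivot: tuple
    members: list = field(default_factory=list)
    radius: float = 0.0  # largest pivot-to-member distance seen so far

    def add(self, obj):
        self.members.append(obj)
        self.radius = max(self.radius, euclidean(self.pivot, obj))


def build_clusters(objects, max_radius):
    """Assumed clustering rule: an object joins the first cluster whose
    pivot lies within max_radius; otherwise it becomes a new pivot."""
    clusters = []
    for obj in objects:
        for cluster in clusters:
            if euclidean(cluster.pivot, obj) <= max_radius:
                cluster.add(obj)
                break
        else:
            clusters.append(Cluster(pivot=obj, members=[obj]))
    return clusters


def range_search(clusters, query, r):
    """Return (objects within distance r of query, number of clusters skipped).

    Pre-check: by the triangle inequality, every member m of a cluster
    satisfies d(query, m) >= d(query, pivot) - radius. If that lower bound
    already exceeds r, no member can match and the cluster is skipped
    without computing any member distances.
    """
    hits, skipped = [], 0
    for cluster in clusters:
        if euclidean(query, cluster.pivot) - cluster.radius > r:
            skipped += 1
            continue
        for obj in cluster.members:
            if euclidean(query, obj) <= r:
                hits.append(obj)
    return hits, skipped


# Illustrative data only; not taken from the paper's experiments.
objects = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (50, 50)]
clusters = build_clusters(objects, max_radius=2.0)
hits, skipped = range_search(clusters, query=(0.5, 0.5), r=1.0)
print(hits, skipped)
```

In a MapReduce setting, one plausible arrangement (again an assumption, not something the abstract states) would be to run the per-cluster pre-check inside the map phase so that pruned clusters never reach the distance computation.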