{"title":"一种利用pTree技术计算平方欧氏距离的新方法及其性能分析","authors":"Mohammad Hossain, Sameer Abufardeh","doi":"10.29007/TRRG","DOIUrl":null,"url":null,"abstract":"One of the advantages of Euclidean distance is that it measures the regular distance between two points in space. For this reason, it is widely used in the applications where the distance between data points are needed to be calculated to measure similarities. However, this method is costly as there involve expensive square and square root operations. One useful observation is that in many data mining applications absolute distance measures are not necessary as long as the distances are used to compare the closeness between various data points. For example, in classification and clustering, we often measure the distances of multiple data points to compare their distances from known classes or from centroids to assign those points in a class or in a cluster. In this regards, an alternative approach known as Squared Euclidean Distance (SED) can be used to avoid the computation of square root to get the squared distance between the data points. SED has been used in classification, clustering, image processing, and other areas to save the computational expenses. In this paper, we show how SED can be calculated for the vertical data represented in pTrees. We also analyze its performance and compared it with traditional horizontal data representation.","PeriodicalId":264035,"journal":{"name":"International Conference on Computers and Their Applications","volume":"51 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-03-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"6","resultStr":"{\"title\":\"A New Method of Calculating Squared Euclidean Distance (SED) Using pTree Technology and Its Performance Analysis\",\"authors\":\"Mohammad Hossain, Sameer Abufardeh\",\"doi\":\"10.29007/TRRG\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"One of the advantages of Euclidean distance is that it measures the regular distance between two points in space. For this reason, it is widely used in the applications where the distance between data points are needed to be calculated to measure similarities. However, this method is costly as there involve expensive square and square root operations. One useful observation is that in many data mining applications absolute distance measures are not necessary as long as the distances are used to compare the closeness between various data points. For example, in classification and clustering, we often measure the distances of multiple data points to compare their distances from known classes or from centroids to assign those points in a class or in a cluster. In this regards, an alternative approach known as Squared Euclidean Distance (SED) can be used to avoid the computation of square root to get the squared distance between the data points. SED has been used in classification, clustering, image processing, and other areas to save the computational expenses. In this paper, we show how SED can be calculated for the vertical data represented in pTrees. We also analyze its performance and compared it with traditional horizontal data representation.\",\"PeriodicalId\":264035,\"journal\":{\"name\":\"International Conference on Computers and Their Applications\",\"volume\":\"51 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-03-13\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"6\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International Conference on Computers and Their Applications\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.29007/TRRG\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Conference on Computers and Their Applications","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.29007/TRRG","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
A New Method of Calculating Squared Euclidean Distance (SED) Using pTree Technology and Its Performance Analysis
One of the advantages of Euclidean distance is that it measures the regular distance between two points in space. For this reason, it is widely used in the applications where the distance between data points are needed to be calculated to measure similarities. However, this method is costly as there involve expensive square and square root operations. One useful observation is that in many data mining applications absolute distance measures are not necessary as long as the distances are used to compare the closeness between various data points. For example, in classification and clustering, we often measure the distances of multiple data points to compare their distances from known classes or from centroids to assign those points in a class or in a cluster. In this regards, an alternative approach known as Squared Euclidean Distance (SED) can be used to avoid the computation of square root to get the squared distance between the data points. SED has been used in classification, clustering, image processing, and other areas to save the computational expenses. In this paper, we show how SED can be calculated for the vertical data represented in pTrees. We also analyze its performance and compared it with traditional horizontal data representation.