A New Method of Calculating Squared Euclidean Distance (SED) Using pTree Technology and Its Performance Analysis

Mohammad Hossain, Sameer Abufardeh
International Conference on Computers and Their Applications, published 2019-03-13
DOI: 10.29007/TRRG (https://doi.org/10.29007/TRRG)
Citations: 6

Abstract

One advantage of Euclidean distance is that it measures the ordinary straight-line distance between two points in space. For this reason, it is widely used in applications where distances between data points must be calculated to measure similarity. However, it is costly because it involves expensive square and square root operations. A useful observation is that many data mining applications do not need absolute distances, as long as the distances are only used to compare how close various data points are. For example, in classification and clustering, we often measure the distances of multiple data points from known classes or from centroids in order to assign those points to a class or a cluster. In this regard, an alternative known as Squared Euclidean Distance (SED) can be used: it yields the squared distance between data points and avoids computing the square root. SED has been used in classification, clustering, image processing, and other areas to reduce computational cost. In this paper, we show how SED can be calculated for vertical data represented in pTrees. We also analyze its performance and compare it with the traditional horizontal data representation.
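The abstract does not describe the pTree algorithm itself, so the following is only a minimal sketch of the two ideas it names: SED preserves the ordering of Euclidean distances, and SED can be computed from a vertical (bit-sliced) layout. The function names, the sample data, and the use of plain Python lists as bit slices are assumptions made for illustration; real pTrees are compressed tree structures, and the paper's actual method may differ.

```python
import math


def squared_euclidean(p, q):
    """Squared Euclidean distance (SED): no square root is taken."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def euclidean(p, q):
    """Ordinary Euclidean distance, shown only for comparison."""
    return math.sqrt(squared_euclidean(p, q))


def to_bit_slices(column, n_bits):
    """Vertical layout: slice b holds bit b of every value in the column."""
    return [[(v >> b) & 1 for v in column] for b in range(n_bits)]


def sed_from_bit_slices(slices_per_attr, query):
    """SED from every row to `query`, computed slice by slice.

    Expands (x - q)^2 = x^2 - 2*q*x + q^2 with x = sum_b 2^b * bit_b,
    so x^2 = sum_{i,j} 2^(i+j) * (bit_i AND bit_j).
    Assumes non-negative integer attribute values.
    """
    n_rows = len(slices_per_attr[0][0])
    out = [0] * n_rows
    for slices, q in zip(slices_per_attr, query):
        for i, s_i in enumerate(slices):
            for r in range(n_rows):
                out[r] -= 2 * q * (s_i[r] << i)           # -2*q*x term
            for j, s_j in enumerate(slices):
                weight = 1 << (i + j)
                for r in range(n_rows):
                    out[r] += weight * (s_i[r] & s_j[r])  # x^2 term
        for r in range(n_rows):
            out[r] += q * q                               # q^2 term
    return out


points = [(3, 4), (10, 1), (6, 8)]   # horizontal rows (made-up data)
query = (5, 5)                       # hypothetical centroid

horizontal = [squared_euclidean(p, query) for p in points]
columns = list(zip(*points))         # one column per attribute
vertical = sed_from_bit_slices([to_bit_slices(c, 4) for c in columns], query)
```

Both layouts give the same squared distances, and sorting the points by SED yields the same order as sorting by Euclidean distance, which is why the square root can be skipped when distances are only compared.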