Parallel DBSCAN Clustering Algorithm Using Hadoop Map-reduce Framework for Spatial Data

M. C., C. H

Journal: International Journal of Information Technology and Computer Science
Publication date: 2022-12-08
Publication type: Journal Article
DOI: 10.5815/ijitcs.2022.06.01 (https://doi.org/10.5815/ijitcs.2022.06.01)
Citations: 2

Abstract

Data clustering is a first step in many big data analysis applications and a driving model for Artificial Intelligence and Machine Learning architectures. Processing large volumes of data quickly is a major challenge in these applications and requires fast, efficient algorithms. Parallel clustering algorithms are one promising design for increasing the speed at which such data are handled. In this paper, a parallel algorithm for clustering spatial datasets, called P-DBSCAN, is implemented using the Hadoop MapReduce framework. The work targets improved data clustering in data analytics applications. The new P-DBSCAN algorithm is executed over a generated dataset, and its results are compared with those of the existing DBSCAN algorithm to show the improvement in runtime performance. The work offers reduced execution time. In addition, the outcome of P-DBSCAN shows how the scalability problem of large datasets can be addressed.
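
The abstract does not spell out how P-DBSCAN divides the work, so the following is only a minimal sketch of one common way to run DBSCAN in a map/reduce style, not the authors' implementation. It assumes the data are split into vertical strips of a given width, that each point is also copied into any strip lying within eps of it, and that local clusters are merged whenever they share a point that is a core point in its home strip. All function names (`map_point`, `reduce_strip`, `p_dbscan`) and the `width` parameter are hypothetical, and the map, shuffle and reduce phases are simulated in plain Python rather than through the Hadoop API.

```python
from collections import defaultdict
from math import dist, floor

NOISE = -1

def dbscan(points, eps, min_pts):
    """Plain sequential DBSCAN; returns (labels, is_core), both indexed by point."""
    n = len(points)
    neighbors = [[j for j in range(n) if dist(points[i], points[j]) <= eps]
                 for i in range(n)]
    is_core = [len(nb) >= min_pts for nb in neighbors]
    labels = [NOISE] * n
    cluster = 0
    for i in range(n):
        if labels[i] != NOISE or not is_core[i]:
            continue
        labels[i] = cluster
        stack = [i]                      # only core points are ever pushed
        while stack:
            p = stack.pop()
            for q in neighbors[p]:
                if labels[q] == NOISE:
                    labels[q] = cluster
                    if is_core[q]:
                        stack.append(q)
        cluster += 1
    return labels, is_core

def map_point(pid, point, eps, width):
    """Map (assumed scheme): emit the point to its home strip and to every
    strip within eps of it, flagging which copy is the home copy."""
    home = floor(point[0] / width)
    first = floor((point[0] - eps) / width)
    last = floor((point[0] + eps) / width)
    for strip in range(first, last + 1):
        yield strip, (pid, point, strip == home)

def reduce_strip(strip, records, eps, min_pts):
    """Reduce: cluster one strip locally and emit each clustered point with
    its local cluster id and whether it is a core point in its home strip."""
    labels, is_core = dbscan([pt for _, pt, _ in records], eps, min_pts)
    for (pid, _, is_home), label, core in zip(records, labels, is_core):
        if label != NOISE:
            yield pid, (strip, label), is_home and core

def p_dbscan(points, eps, min_pts, width):
    """Simulated map -> shuffle -> reduce -> merge pipeline."""
    groups = defaultdict(list)           # shuffle: strip -> records
    for pid, pt in enumerate(points):
        for strip, record in map_point(pid, pt, eps, width):
            groups[strip].append(record)

    memberships = defaultdict(list)      # pid -> local cluster ids
    home_core = set()                    # pids that are core in their home strip
    for strip, records in groups.items():
        for pid, local_id, core_at_home in reduce_strip(strip, records, eps, min_pts):
            memberships[pid].append(local_id)
            if core_at_home:
                home_core.add(pid)

    parent = {}                          # union-find over local cluster ids
    def find(c):
        parent.setdefault(c, c)
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    for pid, local_ids in memberships.items():
        if pid in home_core:             # a shared core point joins its clusters
            for c in local_ids[1:]:
                parent[find(c)] = find(local_ids[0])

    labels, global_ids = [], {}
    for pid in range(len(points)):
        if pid not in memberships:
            labels.append(NOISE)
        else:
            root = find(memberships[pid][0])
            labels.append(global_ids.setdefault(root, len(global_ids)))
    return labels
```

Because every point sees its full eps-neighbourhood in its home strip, core points are identified exactly there, and the merge step reconnects clusters that the strip boundaries cut apart. Border points that lie within eps of two different clusters may be assigned to either one, as in sequential DBSCAN.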