DBSCAN Clustering Algorithm Based on Density

2020 7th International Forum on Electrical Engineering and Automation (IFEEA). Pub Date: 2020-09-01. DOI: 10.1109/IFEEA51475.2020.00199

Dingsheng Deng


Citations: 40

Abstract

Clustering has important applications in data mining, pattern recognition, machine learning, and other fields. However, with the explosive growth of data, traditional clustering algorithms find it increasingly difficult to meet the needs of big data analysis. How to improve traditional clustering algorithms while ensuring clustering quality and efficiency in a big data setting has become an important research topic in artificial intelligence and big data processing. Density-based clustering algorithms can cluster arbitrarily shaped data sets when the data distribution is unknown. DBSCAN is a classical density-based clustering algorithm that is widely used for cluster analysis because it is simple and efficient. This paper studies the density-based DBSCAN clustering algorithm. It first introduces the concepts behind DBSCAN and then tests its performance on three different data sets. The experimental results indicate that DBSCAN achieves higher homogeneity and diversity when performing personalized clustering on data sets of non-uniform density whose values span a broad range and become gradually sparser. With the neighborhood distance eps set to 1000, clustering produces 26 classes.
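To illustrate the role of the neighborhood distance eps mentioned in the abstract, here is a minimal pure-Python sketch of the standard DBSCAN procedure. It is not the paper's implementation: the function names (dbscan, region_query), the min_pts threshold, and the toy points are assumptions chosen only for this example.

```python
from math import dist

NOISE = -1  # label for points that belong to no cluster


def region_query(points, i, eps):
    """Indices of all points within distance eps of points[i] (i included)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, NOISE for outliers."""
    labels = [None] * len(points)  # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may be relabeled later as a border point
            continue
        cluster += 1  # points[i] is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: reachable but not core
            if labels[j] is not None:
                continue  # already assigned to a cluster
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also core: keep expanding
    return labels


# Hypothetical toy data: two dense groups and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]

print(dbscan(points, eps=1.5, min_pts=3))  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
print(dbscan(points, eps=100, min_pts=3))  # [0, 0, 0, 0, 0, 0, 0, 0, 0]
```

On this toy data a small eps yields two clusters plus one noise point, while a very large eps merges everything into a single cluster, which is why the number of classes reported for a data set depends on the chosen eps.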

