BPNN anomaly data detection algorithm based on gray wolf algorithm to optimize K-means clustering

2021 2nd International Conference on Artificial Intelligence and Computer Engineering (ICAICE) Pub Date : 2021-11-01 DOI:10.1109/icaice54393.2021.00038

Ming Run-Yang, Xing Feng-Guo, Yuan Feng-Huang, Bing Quan-Chen



Abstract

The K-means clustering algorithm tends to fall into local optima during clustering, which makes its results prone to error. To address this, this paper proposes a K-means clustering algorithm optimized by gray wolf optimization: the global search ability of the gray wolf optimization algorithm is used to select the initial K-means cluster centers, and the cluster centers are then updated through the iteratively refined wolf $\alpha$. The BP neural network, as a supervised learning algorithm, requires prior knowledge of the data for training, and because different events generate different types of data, its applicability is limited. The paper therefore combines the gray-wolf-optimized K-means clustering algorithm with a BP neural network. The initial data set is clustered by K-means and the clustered data are labeled; the labeled data are then imported into the BP neural network as a training set, yielding the final detection model for online detection of large amounts of data. The proposed algorithm is compared with the traditional K-means clustering algorithm and the random forest algorithm based on firefly optimization proposed in [12] on the IBLK dataset and the Taihu Lake water quality dataset. In these experiments, the detection rate increased by 8.9%, 17.7% and 1.15%, 12.6%, and the false alarm rate decreased by 8.1%, 19.3% and 1.12%, 13.6%, respectively.
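
The seeding step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`gwo_centers`, `kmeans`, `sse`), the population size, the iteration counts, and the synthetic two-blob data are all assumptions made for illustration, and the position update is the textbook gray wolf rule (the three best wolves pull the rest of the pack), which may differ from the variant used in the paper. The best wolf found, playing the role of wolf $\alpha$, seeds ordinary K-means iterations; the BP neural network stage is omitted.

```python
import random

def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def assign(points, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda c: sq_dist(pt, centers[c]))
            for pt in points]

def sse(points, centers):
    """K-means objective: squared distance of each point to its nearest center."""
    return sum(min(sq_dist(pt, c) for c in centers) for pt in points)

def gwo_centers(points, k, n_wolves=12, n_iter=40, seed=0):
    """Search for k initial centers with a basic gray wolf optimizer (assumed settings)."""
    rng = random.Random(seed)
    dim = len(points[0])
    lo = [min(p[j] for p in points) for j in range(dim)]
    hi = [max(p[j] for p in points) for j in range(dim)]
    # Each wolf is one flat vector encoding k candidate centers.
    split = lambda w: [w[i * dim:(i + 1) * dim] for i in range(k)]
    fitness = lambda w: sse(points, split(w))
    wolves = [[rng.uniform(lo[j % dim], hi[j % dim]) for j in range(k * dim)]
              for _ in range(n_wolves)]
    best = list(min(wolves, key=fitness))
    for t in range(n_iter):
        # The three fittest wolves (alpha, beta, delta) lead this iteration.
        leaders = [list(w) for w in sorted(wolves, key=fitness)[:3]]
        a = 2.0 * (1 - t / n_iter)  # decreases linearly from 2 towards 0
        for w in wolves:
            for j in range(k * dim):
                pulled = 0.0
                for lead in leaders:
                    A = 2 * a * rng.random() - a
                    C = 2 * rng.random()
                    pulled += lead[j] - A * abs(C * lead[j] - w[j])
                # Average the three pulls and keep the coordinate inside the data range.
                w[j] = min(max(pulled / 3, lo[j % dim]), hi[j % dim])
        best = list(min(wolves + [best], key=fitness))  # remember the best wolf seen
    return split(best)

def kmeans(points, centers, n_iter=20):
    """Lloyd's iterations from the given centers; returns (centers, labels)."""
    centers = [list(c) for c in centers]
    for _ in range(n_iter):
        labels = assign(points, centers)
        for c in range(len(centers)):
            members = [pt for pt, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers, assign(points, centers)

# Synthetic demo data (an assumption, not the paper's datasets): two 2-D blobs.
rng = random.Random(1)
points = ([[rng.gauss(0, 0.5), rng.gauss(0, 0.5)] for _ in range(30)] +
          [[rng.gauss(5, 0.5), rng.gauss(5, 0.5)] for _ in range(30)])
seed_centers = gwo_centers(points, k=2)
centers, labels = kmeans(points, seed_centers)
print("SSE at GWO seed:", round(sse(points, seed_centers), 2))
print("SSE after K-means:", round(sse(points, centers), 2))
```

In the pipeline the abstract outlines, the resulting `labels` would serve as training targets for the BP neural network, so that new samples can be classified online without re-running the clustering.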


