Robust Clustering Using Hyperdimensional Computing

Impact factor: 2.4 · JCR Q2 (Engineering, Electrical & Electronic)

IEEE Open Journal of Circuits and Systems · Publication date: 2024-03-26 · DOI: 10.1109/OJCAS.2024.3381508

Lulu Ge; Keshab K. Parhi

Volume 5, pages 102-116. Full text: https://ieeexplore.ieee.org/document/10480378/


Abstract

This paper addresses the clustering of data in the hyperdimensional computing (HDC) domain. In prior work, an HDC-based clustering framework, referred to as HDCluster, has been proposed. However, the performance of the existing HDCluster is not robust. The performance of HDCluster is degraded as the hypervectors for the clusters are chosen at random during the initialization step. To overcome this bottleneck, we assign the initial cluster hypervectors by exploring the similarity of the encoded data, referred to as query hypervectors. Intra-cluster hypervectors have a higher similarity than inter-cluster hypervectors. Harnessing the similarity results among query hypervectors, this paper proposes four HDC-based clustering algorithms: similarity-based k-means, equal bin-width histogram, equal bin-height histogram, and similarity-based affinity propagation. Experimental results illustrate that: (i) Compared to the existing HDCluster, our proposed HDC-based clustering algorithms can achieve better accuracy, more robust performance, fewer iterations, and less execution time. Similarity-based affinity propagation outperforms the other three HDC-based clustering algorithms on eight datasets by 2% to 38% in clustering accuracy. (ii) Even for one-pass clustering, i.e., without any iterative update of the cluster hypervectors, our proposed algorithms can provide more robust clustering accuracy than HDCluster. (iii) Over eight datasets, five out of eight can achieve higher or comparable accuracy when projected onto the hyperdimensional space. Traditional clustering is more desirable than HDC when the number of clusters, $k$, is large.
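The idea of seeding cluster hypervectors from query similarity, rather than at random, can be illustrated with a small sketch. The abstract does not specify the encoder or the details of the four proposed algorithms, so everything below is an assumption made for illustration: the sign-of-random-projection encoder, the dimensionality `D`, the farthest-first seeding rule in `pick_seeds`, and all function names are hypothetical and are not the authors' method. The sketch only shows that similarities among query hypervectors carry enough information to choose well-separated initial clusters and to perform a one-pass assignment.

```python
import random

D = 1024  # hypervector dimensionality (an assumption for this sketch)

def make_bases(n_features, rng, d=D):
    """One Gaussian random vector per input feature (hypothetical encoder)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n_features)]

def encode(point, bases):
    """Map a real-valued point to a bipolar query hypervector in {-1, +1}^D
    by taking the sign of a random projection."""
    d = len(bases[0])
    return [1 if sum(x * b[j] for x, b in zip(point, bases)) >= 0 else -1
            for j in range(d)]

def similarity(a, b):
    """Cosine similarity; for bipolar vectors this is dot(a, b) / D."""
    return sum(x * y for x, y in zip(a, b)) / len(a)

def pick_seeds(queries, k):
    """Similarity-guided seeding (illustrative stand-in): start from query 0,
    then repeatedly add the query least similar to every seed chosen so far."""
    seeds = [0]
    while len(seeds) < k:
        best = min(
            (i for i in range(len(queries)) if i not in seeds),
            key=lambda i: max(similarity(queries[i], queries[s]) for s in seeds),
        )
        seeds.append(best)
    return seeds

def one_pass_cluster(queries, k):
    """Assign every query to its most similar seed, with no iterative update."""
    seeds = pick_seeds(queries, k)
    return [max(range(k), key=lambda c: similarity(q, queries[seeds[c]]))
            for q in queries]

# Toy data: three groups of 2-D points separated by angle, 10 points each.
rng = random.Random(0)
centers = [(1.0, 0.0), (-0.5, 0.866), (-0.5, -0.866)]
points = [(cx + rng.gauss(0, 0.05), cy + rng.gauss(0, 0.05))
          for cx, cy in centers for _ in range(10)]
bases = make_bases(2, rng)
queries = [encode(p, bases) for p in points]
labels = one_pass_cluster(queries, k=3)
print(labels)
```

Because this encoder preserves angles rather than distances, the toy groups are placed 120 degrees apart; intra-group query hypervectors then have high similarity and inter-group ones have low similarity, which is the property the abstract relies on. A random choice of initial cluster hypervectors could place two seeds in the same group, which is the failure mode the similarity-based initialization is meant to avoid.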


Source journal: IEEE Open Journal of Circuits and Systems

Self-citation rate: 0.00%

Review time: 19 weeks