基于内核的 iVAT，具有自适应群组提取功能

IF 3.1 4区计算机科学 Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Knowledge and Information Systems Pub Date : 2024-09-06 DOI:10.1007/s10115-024-02189-1

Baojie Zhang, Ye Zhu, Yang Cao, Sutharshan Rajasegarar, Gang Li, Gang Liu

{"title":"基于内核的 iVAT，具有自适应群组提取功能","authors":"Baojie Zhang, Ye Zhu, Yang Cao, Sutharshan Rajasegarar, Gang Li, Gang Liu","doi":"10.1007/s10115-024-02189-1","DOIUrl":null,"url":null,"abstract":"Visual Assessment of cluster Tendency (VAT) is a popular method that visually represents the possible clusters found in a dataset as dark blocks along the diagonal of a reordered dissimilarity image (RDI). Although many variants of the VAT algorithm have been proposed to improve the visualisation quality on different types of datasets, they still suffer from the challenge of extracting clusters with varied densities. In this paper, we focus on overcoming this drawback of VAT algorithms by incorporating kernel methods and also propose a novel adaptive cluster extraction strategy, named CER, to effectively identify the local clusters from the RDI. We examine their effects on an improved VAT method (iVAT) and systematically evaluate the clustering performance on 18 synthetic and real-world datasets. The experimental results reveal that the recently proposed data-dependent dissimilarity measure, namely the Isolation kernel, helps to significantly improve the RDI image for easy cluster identification. Furthermore, the proposed cluster extraction method, CER, outperforms other existing methods on most of the datasets in terms of a series of dissimilarity measures.\n","PeriodicalId":54749,"journal":{"name":"Knowledge and Information Systems","volume":"9 1","pages":""},"PeriodicalIF":3.1000,"publicationDate":"2024-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Kernel-based iVAT with adaptive cluster extraction\",\"authors\":\"Baojie Zhang, Ye Zhu, Yang Cao, Sutharshan Rajasegarar, Gang Li, Gang Liu\",\"doi\":\"10.1007/s10115-024-02189-1\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Visual Assessment of cluster Tendency (VAT) is a popular method that visually represents the possible clusters found in a dataset as dark blocks along the diagonal of a reordered dissimilarity image (RDI). Although many variants of the VAT algorithm have been proposed to improve the visualisation quality on different types of datasets, they still suffer from the challenge of extracting clusters with varied densities. In this paper, we focus on overcoming this drawback of VAT algorithms by incorporating kernel methods and also propose a novel adaptive cluster extraction strategy, named CER, to effectively identify the local clusters from the RDI. We examine their effects on an improved VAT method (iVAT) and systematically evaluate the clustering performance on 18 synthetic and real-world datasets. The experimental results reveal that the recently proposed data-dependent dissimilarity measure, namely the Isolation kernel, helps to significantly improve the RDI image for easy cluster identification. Furthermore, the proposed cluster extraction method, CER, outperforms other existing methods on most of the datasets in terms of a series of dissimilarity measures.\\n\",\"PeriodicalId\":54749,\"journal\":{\"name\":\"Knowledge and Information Systems\",\"volume\":\"9 1\",\"pages\":\"\"},\"PeriodicalIF\":3.1000,\"publicationDate\":\"2024-09-06\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Knowledge and Information Systems\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1007/s10115-024-02189-1\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Knowledge and Information Systems","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s10115-024-02189-1","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 0

摘要

聚类倾向可视化评估（VAT）是一种流行的方法，它将数据集中可能存在的聚类直观地表示为沿重排异同图像（RDI）对角线的暗色块。尽管 VAT 算法的许多变体已被提出，以提高不同类型数据集的可视化质量，但它们仍然面临着提取不同密度聚类的挑战。在本文中，我们将重点放在结合核方法来克服 VAT 算法的这一缺点，并提出了一种名为 CER 的新型自适应聚类提取策略，以有效识别 RDI 中的局部聚类。我们研究了它们对改进型 VAT 方法（iVAT）的影响，并在 18 个合成数据集和实际数据集上系统地评估了聚类性能。实验结果表明，最近提出的依赖于数据的差异度量（即隔离核）有助于显著改善 RDI 图像，从而轻松识别聚类。此外，在大多数数据集上，所提出的聚类提取方法 CER 在一系列异质性度量方面都优于其他现有方法。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

Kernel-based iVAT with adaptive cluster extraction

查看原文本刊更多论文

Kernel-based iVAT with adaptive cluster extraction

Visual Assessment of cluster Tendency (VAT) is a popular method that visually represents the possible clusters found in a dataset as dark blocks along the diagonal of a reordered dissimilarity image (RDI). Although many variants of the VAT algorithm have been proposed to improve the visualisation quality on different types of datasets, they still suffer from the challenge of extracting clusters with varied densities. In this paper, we focus on overcoming this drawback of VAT algorithms by incorporating kernel methods and also propose a novel adaptive cluster extraction strategy, named CER, to effectively identify the local clusters from the RDI. We examine their effects on an improved VAT method (iVAT) and systematically evaluate the clustering performance on 18 synthetic and real-world datasets. The experimental results reveal that the recently proposed data-dependent dissimilarity measure, namely the Isolation kernel, helps to significantly improve the RDI image for easy cluster identification. Furthermore, the proposed cluster extraction method, CER, outperforms other existing methods on most of the datasets in terms of a series of dissimilarity measures.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Knowledge and Information Systems 工程技术-计算机：人工智能

CiteScore

5.70

自引率

7.40%

发文量

152

审稿时长

7.2 months

期刊介绍： Knowledge and Information Systems (KAIS) provides an international forum for researchers and professionals to share their knowledge and report new advances on all topics related to knowledge systems and advanced information systems. This monthly peer-reviewed archival journal publishes state-of-the-art research reports on emerging topics in KAIS, reviews of important techniques in related areas, and application papers of interest to a general readership.