Hierarchical Fuzzy-Cluster-Aware Grid Layout for Large-Scale Data.

IEEE Transactions on Visualization and Computer Graphics. Pub Date: 2025-05-02. DOI: 10.1109/TVCG.2025.3566558

Yuxing Zhou, Changjian Chen, Zhiyang Shen, Jiangning Zhu, Jiashu Chen, Weikai Yang, Shixia Liu


Citations: 0

Abstract

Fuzzy clusters, in which ambiguous samples belong to multiple clusters, are common in real-world applications. Analyzing such ambiguous samples in large-scale datasets is crucial for practical tasks such as diagnosing machine learning models. Hierarchical cluster-aware grid visualizations are a promising way to support such analysis because they offer high space efficiency and clear cluster perception. However, existing cluster-aware grid layout methods cannot clarify ambiguity among fuzzy clusters, which limits their effectiveness in fuzzy cluster analysis. To tackle this issue, we introduce a hierarchical fuzzy-cluster-aware grid layout method that supports hierarchical exploration of large-scale datasets. Throughout the hierarchical exploration, it is crucial to facilitate fuzzy cluster analysis while maintaining visual continuity for users. To achieve this, we propose a two-step optimization strategy that enhances cluster perception, clarifies ambiguity, and preserves stability during exploration. The first step creates cluster-aware partitions, where each partition corresponds to a cluster. This step focuses on enhancing cluster perception and on maintaining the previous shapes and positions of clusters to preserve stability at the cluster level. The second step generates a grid layout for each partition. In addition to placing similar samples together, this step places ambiguous samples near the boundaries to clarify ambiguity and reveal the root causes of their occurrence. It also maintains the relative positions of the samples in the same cluster to preserve stability at the sample level. Several quantitative experiments and a use case demonstrate the effectiveness and usefulness of our method in analyzing large-scale datasets, especially in fuzzy cluster analysis.
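The idea behind the second step, placing ambiguous samples near cluster boundaries, can be illustrated with a toy sketch. This is not the authors' optimization: the abstract does not specify their objective or solver, so the greedy pairing below, the margin-based ambiguity score, and all function names (`ambiguity`, `distance_to_other_cluster`, `greedy_layout`) are assumptions made purely for illustration. The sketch takes a grid already partitioned into cluster regions and a fuzzy membership matrix, then fills each region so that the most ambiguous samples land in the cells closest to another cluster.

```python
import numpy as np

def ambiguity(memberships: np.ndarray) -> np.ndarray:
    """Assumed score: 1 minus the gap between the two largest memberships."""
    top2 = np.sort(memberships, axis=1)[:, -2:]  # columns: second largest, largest
    return 1.0 - (top2[:, 1] - top2[:, 0])

def distance_to_other_cluster(partition: np.ndarray) -> np.ndarray:
    """Manhattan distance from each cell to the nearest cell of a different cluster."""
    cells = np.argwhere(np.ones_like(partition, dtype=bool))  # row-major coordinates
    labels = partition.ravel()
    dist = np.abs(cells[:, None, :] - cells[None, :, :]).sum(axis=2).astype(float)
    dist[labels[:, None] == labels[None, :]] = np.inf  # ignore same-cluster cells
    return dist.min(axis=1).reshape(partition.shape)

def greedy_layout(partition: np.ndarray, memberships: np.ndarray) -> np.ndarray:
    """Fill each cluster region, pairing the most ambiguous samples with boundary cells."""
    hard = memberships.argmax(axis=1)          # hard cluster assignment per sample
    amb = ambiguity(memberships)
    depth = distance_to_other_cluster(partition)
    grid = np.full(partition.shape, -1, dtype=int)  # -1 marks an empty cell
    for c in np.unique(partition):
        region = partition == c
        cells = np.argwhere(region)
        samples = np.flatnonzero(hard == c)
        if len(samples) > len(cells):
            raise ValueError(f"cluster {c} has more samples than cells")
        # Cells nearest another cluster first; most ambiguous samples first.
        cells = cells[np.argsort(depth[region], kind="stable")]
        samples = samples[np.argsort(-amb[samples], kind="stable")]
        for (row, col), sample in zip(cells, samples):
            grid[row, col] = sample
    return grid
```

A real method would also have to keep similar samples adjacent and preserve positions across hierarchy levels, which the abstract describes as part of the optimization; this sketch ignores both and shows only the boundary-placement intuition.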


