Hierarchical Fuzzy-Cluster-Aware Grid Layout for Large-Scale Data
Authors: Yuxing Zhou, Changjian Chen, Zhiyang Shen, Jiangning Zhu, Jiashu Chen, Weikai Yang, Shixia Liu
Journal: IEEE Transactions on Visualization and Computer Graphics (Journal Article)
Publication date: 2025-05-02
DOI: 10.1109/TVCG.2025.3566558 (https://doi.org/10.1109/TVCG.2025.3566558)
Citations: 0
Abstract
Fuzzy clusters, where ambiguous samples belong to multiple clusters, are common in real-world applications. Analyzing such ambiguous samples in large-scale datasets is crucial for practical applications, such as diagnosing machine learning models. A promising way to support such analysis is hierarchical cluster-aware grid visualization, which offers high space efficiency and clear cluster perception. However, existing cluster-aware grid layout methods cannot clarify ambiguity among fuzzy clusters, which limits their effectiveness in fuzzy cluster analysis. To tackle this issue, we introduce a hierarchical fuzzy-cluster-aware grid layout method that supports hierarchical exploration of large-scale datasets. Throughout the hierarchical exploration, it is crucial to facilitate fuzzy cluster analysis while maintaining visual continuity for users. To achieve this, we propose a two-step optimization strategy for enhancing cluster perception, clarifying ambiguity, and preserving stability during the exploration. The first step creates cluster-aware partitions, where each partition corresponds to a cluster. This step focuses on enhancing cluster perception and on maintaining the previous shapes and positions of clusters to preserve stability at the cluster level. The second step generates a grid layout for each partition. In addition to placing similar samples together, this step places ambiguous samples near the boundaries, which clarifies ambiguity and reveals the root causes of their occurrence. It also maintains the relative positions of samples in the same cluster to preserve stability at the sample level. Several quantitative experiments and a use case demonstrate the effectiveness and usefulness of our method in analyzing large-scale datasets, especially in fuzzy cluster analysis.
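The two-step idea in the abstract can be illustrated with a deliberately simplified sketch. This is not the authors' optimization. It assumes a fuzzy membership matrix as input, uses plain vertical strips as a stand-in for cluster-aware partitions, and scores ambiguity by a hypothetical margin measure (one minus the gap between the two largest memberships). The stability terms and the similarity-based placement described in the abstract are omitted. The function names `ambiguity` and `toy_layout` are invented for illustration.

```python
import numpy as np


def ambiguity(memberships):
    """Return 1 - (largest - second-largest membership) per sample.

    Assumed measure for illustration; requires at least two clusters.
    """
    top2 = np.sort(memberships, axis=1)[:, -2:]
    return 1.0 - (top2[:, 1] - top2[:, 0])


def toy_layout(memberships, rows):
    """Toy two-step layout on a grid with `rows` rows.

    Step 1: give each cluster a vertical strip of columns (a crude
    stand-in for cluster-aware partitions).
    Step 2: inside each strip, fill the cells closest to a boundary
    shared with a neighboring strip first, most ambiguous samples first.
    Returns a dict mapping sample index -> (row, col).
    """
    n_clusters = memberships.shape[1]
    labels = memberships.argmax(axis=1)
    amb = ambiguity(memberships)
    placement = {}
    col_start = 0
    for c in range(n_clusters):
        members = np.flatnonzero(labels == c)
        width = -(-len(members) // rows)  # ceiling division
        col_end = col_start + width - 1

        def dist_to_boundary(col):
            # Only edges that touch another cluster's strip count.
            dists = []
            if c > 0:
                dists.append(col - col_start)
            if c < n_clusters - 1:
                dists.append(col_end - col)
            return min(dists)

        cells = [(r, col)
                 for col in range(col_start, col_start + width)
                 for r in range(rows)]
        cells.sort(key=lambda cell: dist_to_boundary(cell[1]))
        # Most ambiguous members take the boundary cells.
        order = members[np.argsort(-amb[members], kind="stable")]
        for sample, cell in zip(order, cells):
            placement[int(sample)] = cell
        col_start += width
    return placement


memberships = np.array([
    [0.95, 0.05], [0.55, 0.45], [0.90, 0.10], [0.60, 0.40],
    [0.05, 0.95], [0.45, 0.55], [0.10, 0.90], [0.40, 0.60],
])
placement = toy_layout(memberships, rows=2)
```

In this toy input, samples 1, 3, 5, and 7 have nearly equal memberships in both clusters, so they land in the two columns that meet at the strip boundary, while the confidently assigned samples fill the outer columns.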