PrismBreak: Exploration of Multi-Dimensional Mixture Models

IF 2.9 | CAS Zone 4, Computer Science | JCR Q2, COMPUTER SCIENCE, SOFTWARE ENGINEERING
B. Zahoransky, T. Günther, K. Lawonn
DOI: 10.1111/cgf.70121
Journal: Computer Graphics Forum, volume 44, issue 3
Publication date: 2025-05-23
Publication type: Journal Article
URL: https://onlinelibrary.wiley.com/doi/10.1111/cgf.70121
Citations: 0

Abstract

PrismBreak: Exploration of Multi-Dimensional Mixture Models

In data science, visual data exploration is becoming increasingly challenging as data dimensionality and data sizes continue to grow rapidly. To manage this complexity, two orthogonal approaches are commonly used in practice. First, data is frequently clustered in high-dimensional space by fitting mixture models composed of normal distributions or Student t-distributions. Second, dimensionality reduction is employed to embed high-dimensional point clouds in a two- or three-dimensional space. These algorithms determine the spatial arrangement in low-dimensional space without further user interaction, which leaves little room for guided exploration and data analysis. In this paper, we propose a novel visualization system for the effective exploration and construction of potential subspaces onto which mixture models can be projected. The subspaces are spanned linearly by basis vectors, for which a vast number of combinations is theoretically possible. Our system guides users step by step through the selection process by letting them choose one basis vector at a time. To support each choice, multiple options are pre-visualized at once on a multi-faceted prism. In addition to the qualitative visualization of the distributions, multiple quantitative metrics are calculated by which subspaces can be compared and reordered, including variance, sparsity, and visibility. Further, a bookmarking tool lets users record and compare different basis vector combinations. The usability of the system is evaluated by data scientists, and the system is tested on several high-dimensional data sets.
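
The abstract does not spell out how a mixture model is projected or how the metrics are defined. As a minimal sketch, assuming Gaussian components and an orthonormal basis B, a component N(mean, cov) projects to N(B^T mean, B^T cov B). The code below uses that identity to rank candidates for the next basis vector once a first vector has been fixed. All function names (`gram_schmidt`, `project_component`, `projected_variance`) and the toy data are hypothetical, and the variance score used here (trace of the mixture's total covariance inside the subspace) is an assumed stand-in that may differ from the metric in the paper.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors, tol=1e-12):
    """Orthonormalize vectors in order, skipping linearly dependent ones."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            coeff = dot(w, b)
            w = [wi - coeff * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(dot(w, w))
        if norm > tol:
            basis.append([wi / norm for wi in w])
    return basis

def project_component(mean, cov, basis):
    """Project N(mean, cov) onto span(basis): B^T mean and B^T cov B."""
    proj_mean = [dot(b, mean) for b in basis]
    cov_b = [[dot(row, b) for row in cov] for b in basis]  # cov_b[j] = cov @ b_j
    proj_cov = [[dot(bi, cov_bj) for cov_bj in cov_b] for bi in basis]
    return proj_mean, proj_cov

def projected_variance(weights, means, covs, basis):
    """Assumed score: trace of the mixture's total covariance in the subspace."""
    comps = [project_component(m, c, basis) for m, c in zip(means, covs)]
    total = 0.0
    for k in range(len(basis)):
        mean_k = sum(w * pm[k] for w, (pm, _) in zip(weights, comps))
        second_k = sum(w * (pc[k][k] + pm[k] ** 2)
                       for w, (pm, pc) in zip(weights, comps))
        total += second_k - mean_k ** 2
    return total

# Toy 3D mixture (made-up data): two unit-covariance components separated along x.
weights = [0.5, 0.5]
means = [[0.0, 0.0, 0.0], [4.0, 0.0, 0.0]]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
covs = [identity, identity]

# The first basis vector is already chosen; score candidates for the second.
first = [0.0, 0.0, 1.0]
candidates = {"x": [1.0, 0.0, 0.0], "y": [0.0, 1.0, 0.0], "diag": [1.0, 1.0, 0.0]}
scores = {
    name: projected_variance(weights, means, covs, gram_schmidt([first, cand]))
    for name, cand in candidates.items()
}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)  # prints ['x', 'diag', 'y']
```

The candidate along x ranks first because it is the only direction that separates the two component means; a sparsity or visibility score could be swapped in for `projected_variance` in the same loop.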

Source journal
Computer Graphics Forum
Category: Engineering and Technology, Computer Science: Software Engineering
CiteScore: 5.80
Self-citation rate: 12.00%
Articles published: 175
Review time: 3-6 weeks
Journal description: Computer Graphics Forum is the official journal of Eurographics, published in cooperation with Wiley-Blackwell, and is a unique, international source of information for computer graphics professionals interested in graphics developments worldwide. It is now one of the leading journals for researchers, developers and users of computer graphics in both commercial and academic environments. The journal reports on the latest developments in the field throughout the world and covers all aspects of the theory, practice and application of computer graphics.