DimenFix: A novel meta-strategy to preserve user-defined data values on dimensionality reduction layouts

IF 2.5 | Region 4: Computer Science | JCR Q2: Computer Science, Software Engineering
Zixuan Han, Diede van der Hoorn, Thomas Höllt, Qiaodan Luo, Leonardo Christino, Evangelos Milios, Fernando V. Paulovich
Computers & Graphics-Uk, vol. 130, Article 104231. Published 2025-05-22. DOI: 10.1016/j.cag.2025.104231
https://www.sciencedirect.com/science/article/pii/S009784932500072X
Citations: 0

Abstract

Dimensionality Reduction (DR) methods have become essential tools in the data analysis toolbox. Typically, DR methods combine the features of a multivariate dataset to produce dimensions in a reduced space, preserving some data properties, usually pairwise distances or local neighborhoods. Preserving such properties makes DR methods attractive, but it is also one of their weaknesses. When the embedded dimensions are calculated, usually through non-linear strategies, the original feature values are lost and are not explicitly represented in the spatialization of the produced layouts, which makes it challenging to interpret the results and understand the features' contributions to the attained representations. Some strategies have been proposed to tackle this issue, such as coloring the DR layouts or generating explanations. Still, these are post-processes, so specific feature values are not guaranteed to be preserved or represented. This paper proposes DimenFix, a novel meta-DR strategy that explicitly preserves the values of a particular user-defined feature or of external data (not used to generate the layout) in one of the embedded axes. DimenFix can preserve ordinal (e.g., numerical measures) and nominal (e.g., labels) values and works with virtually any gradient-descent DR method. It requires minimal changes to the underlying DR technique and runs in time linear in the number of data instances. In our results, involving Force Scheme and t-SNE adaptations, DimenFix represented features without heavily impacting distance or neighborhood preservation, enabling hybrid layouts that combine characteristics of scatter plots and DR methods.
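The abstract gives no implementation details, so the following is only a minimal sketch of the general idea of pinning one embedded axis to a chosen feature inside a gradient-descent DR loop. It is not the authors' algorithm. The underlying DR method is a toy stress-minimizing MDS (neither Force Scheme nor t-SNE), and the function name `fixed_axis_mds`, its parameters, and the min-max rescaling of the feature are all assumptions made for illustration.

```python
import numpy as np

def fixed_axis_mds(X, feature, n_iter=200, lr=0.05, seed=0):
    """Toy gradient-descent MDS whose second axis is pinned to `feature`.

    Hypothetical illustration only; not the DimenFix reference implementation.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]

    # Target pairwise distances in the original space, scaled to [0, 1].
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    D = D / D.max()

    # Assumption: min-max rescale the feature so it can serve as the y axis.
    span = feature.max() - feature.min()
    y_fixed = (feature - feature.min()) / span if span > 0 else np.zeros(n)

    # Random x coordinates; y coordinates start at the feature values.
    Y = rng.normal(scale=1e-2, size=(n, 2))
    Y[:, 1] = y_fixed

    for _ in range(n_iter):
        diff = Y[:, None, :] - Y[None, :, :]            # shape (n, n, 2)
        d = np.maximum(np.linalg.norm(diff, axis=-1), 1e-12)
        coeff = (d - D) / d
        np.fill_diagonal(coeff, 0.0)
        # Gradient of sum_{i<j} (d_ij - D_ij)^2 with respect to each Y_i.
        grad = 2.0 * (coeff[:, :, None] * diff).sum(axis=1)
        Y -= lr * grad / n
        # Correction step: overwrite the y axis with the feature values.
        # This touches each point once per iteration.
        Y[:, 1] = y_fixed

    return Y

# Usage: embed random 4-D points while keeping feature 0 on the vertical axis.
rng = np.random.default_rng(1)
X = rng.normal(size=(30, 4))
Y = fixed_axis_mds(X, X[:, 0])
```

In this sketch the vertical position of every point can be read like a scatter-plot axis, while the horizontal position is left free to reduce the distance error as far as the constraint allows.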
Source journal: Computers & Graphics-Uk (Engineering & Technology: Computer Science, Software Engineering)
CiteScore: 5.30
Self-citation rate: 12.00%
Articles published: 173
Review time: 38 days
About the journal: Computers & Graphics is dedicated to disseminating information on research and applications of computer graphics (CG) techniques. The journal encourages articles on: 1. Research and applications of interactive computer graphics, particularly novel interaction techniques and applications of CG to problem domains. 2. State-of-the-art papers on late-breaking, cutting-edge research on CG. 3. Information on innovative uses of graphics principles and technologies. 4. Tutorial papers on both teaching CG principles and innovative uses of CG in education.