Lukas Silvester Barth, Hannaneh Fahimi, Parvaneh Joharinad, Jürgen Jost, Janis Keck, Thomas Jan Mikhail
Applied Categorical Structures, volume 33, issue 5. Published 2025-09-10. DOI: 10.1007/s10485-025-09827-x. Article page: https://link.springer.com/article/10.1007/s10485-025-09827-x
Citation count: 0
Abstract
Fuzzy Simplicial Sets and Their Application to Geometric Data Analysis
In this article, we expand upon the concepts introduced in Spivak (Metric realization of fuzzy simplicial sets, 2009. http://www.dspivak.net/metric_realization090922.pdf) about the relationship between the category \(\textbf{UM}\) of uber metric spaces and the category \(\textbf{sFuz}\) of fuzzy simplicial sets. We show that fuzzy simplicial sets can be regarded as natural combinatorial generalizations of metric relations. Furthermore, we take inspiration from UMAP (McInnes et al., UMAP: Uniform manifold approximation and projection for dimension reduction, 2018) to apply the theory to manifold learning, dimension reduction and data visualization, while refining some of their constructions to put the corresponding theory on a more solid footing. A generalization of the adjunction between \(\textbf{UM}\) and \(\textbf{sFuz}\) allows us to view the adjunctions used in both publications as special cases. Moreover, we derive an explicit description of colimits in \(\textbf{UM}\) and of the realization functor \(\text {Re}:\textbf{sFuz}\rightarrow \textbf{UM}\), and show that \(\textbf{UM}\) can be embedded into \(\textbf{sFuz}\). We prove analogous results for the category \(\textbf{EPMet}\) of extended pseudo-metric spaces. We also provide rigorous definitions of functors that make it possible to recursively merge sets of fuzzy simplicial sets, and we describe the adjunctions between the category of truncated fuzzy simplicial sets and \(\textbf{sFuz}\), which we relate to persistent homology. Combining those constructions, we show a surprising connection between the well-known dimension reduction methods UMAP and Isomap (Tenenbaum et al. in Science 290(5500):2319–2323, 2000) and derive an alternative algorithm, which we call IsUMap, that combines some of the strengths of both methods. Additionally, we develop a new embedding method that preserves clusters detected in the original metric space that we construct from the data. The visualization of the optimization process gives the user information about both the intra-cluster distributions in the original metric space and the inter-cluster relations. We compare our new method with UMAP, Isomap and t-SNE on a series of low- and high-dimensional datasets and provide explanations for observed differences and improvements.
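The abstract refers to a metric space constructed from the data and to extended pseudo-metric spaces, in which distances may be infinite. As a rough illustration only, and not the authors' IsUMap construction, the sketch below shows the classical Isomap-style first step: build a symmetrized k-nearest-neighbor graph and take shortest-path lengths as distances. The function name `knn_geodesic_distances`, the parameter `k`, and the choice to symmetrize by keeping an edge whenever either endpoint selects the other are assumptions made for this example.

```python
import math


def knn_geodesic_distances(points, k):
    """Shortest-path distances on a symmetrized k-nearest-neighbor graph.

    Hypothetical helper for illustration. Returns an n x n list of lists.
    Entries are math.inf for points in different connected components,
    so the result is an extended metric on the sample.
    """
    n = len(points)
    dist = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        dist[i][i] = 0.0
        # Euclidean distances from point i to every other point, ascending.
        neighbors = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        for d, j in neighbors[:k]:
            # Assumption: keep an edge if either endpoint selects the other.
            dist[i][j] = min(dist[i][j], d)
            dist[j][i] = min(dist[j][i], d)
    # Floyd-Warshall all-pairs shortest paths, O(n^3); only for small n.
    for m in range(n):
        for i in range(n):
            for j in range(n):
                via = dist[i][m] + dist[m][j]
                if via < dist[i][j]:
                    dist[i][j] = via
    return dist


# Two well-separated groups on a line; with k=1 the graph is disconnected.
sample = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
D = knn_geodesic_distances(sample, k=1)
```

With `k=1` the two groups in `sample` are not joined by any edge, so distances between them stay infinite, while distances within a group are sums of edge lengths along the chain. Classical Isomap would then apply multidimensional scaling to such a distance matrix, which requires a connected graph.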
About the journal:
Applied Categorical Structures focuses on applications of results, techniques and ideas from category theory to mathematics, physics and computer science. These include the study of topological and algebraic categories, representation theory, algebraic geometry, homological and homotopical algebra, derived and triangulated categories, categorification of (geometric) invariants, categorical investigations in mathematical physics, higher category theory and applications, and categorical investigations in functional analysis, continuous order theory and theoretical computer science. In addition, the journal follows the development of emerging fields in which the application of categorical methods proves to be relevant.
Applied Categorical Structures publishes both carefully refereed research papers and survey papers. It promotes communication and increases the dissemination of new results and ideas among mathematicians and computer scientists who use categorical methods in their research.