Analyzing Hierarchical Relationships and Quality of Embedding in Latent Space

IEEE Transactions on Artificial Intelligence, vol. 6, no. 4, pp. 843-858. Pub Date: 2024-11-13. DOI: 10.1109/TAI.2024.3497921. URL: https://ieeexplore.ieee.org/document/10752921/

Ankita Chatterjee; Jayanta Mukherjee; Partha Pratim Das


Citations: 0

Abstract

Existing learning models partition the generated representations using hyperplanes, which form well-defined groups of similar embeddings that are uniquely mapped to a particular class. However, in practical applications, the embedding space does not form distinct boundaries to segregate the class representations. Similar classes interact in ways that cannot be visually determined in high-dimensional space. Moreover, the structure of the latent space remains obscure. As learned representations are frequently reused to reduce the inference time, it is important to analyse how semantically related classes interact among themselves in the latent space. Therefore, we propose a boundary estimation algorithm that minimises the inclusion of other classes in the embedding space to form groups of similar representations, and we compare the quality of these class embeddings for various models in an already encoded space. These groups overlap to denote ambiguous embeddings that cannot be mapped to a particular class with high confidence. The algorithm determines which representations are to be included or discarded to form well-defined regions, separating discriminating, ambiguous and rejected embeddings to depict a particular class. Later, we construct relation trees to evaluate the hierarchical relationships formed among the classes, and compare them with the WordNet ontology using phylogenetic tree comparison methods.
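
To make the final step more concrete, the sketch below shows one widely used phylogenetic tree comparison measure, the Robinson-Foulds distance, applied to two small rooted trees. The abstract does not say which comparison methods the authors use, so this choice is an assumption. The nested-tuple tree encoding, the function names, and the toy class labels are also hypothetical and are not taken from the paper.

```python
from typing import FrozenSet, Set, Tuple, Union

# A rooted tree is either a leaf label (str) or a tuple of subtrees.
# This encoding is an illustrative assumption, not the paper's data structure.
Tree = Union[str, Tuple["Tree", ...]]


def clades(tree: Tree) -> Set[FrozenSet[str]]:
    """Collect the set of leaf labels found under every internal node."""
    found: Set[FrozenSet[str]] = set()

    def leaves(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):
            return frozenset([node])
        under = frozenset().union(*(leaves(child) for child in node))
        found.add(under)
        return under

    leaves(tree)
    return found


def rf_distance(a: Tree, b: Tree) -> int:
    """Rooted Robinson-Foulds distance: clades present in exactly one tree."""
    return len(clades(a) ^ clades(b))


# Toy example with hypothetical class labels.
learned = ((("cat", "dog"), "wolf"), ("car", "truck"))
reference = (("cat", ("dog", "wolf")), ("car", "truck"))

print(rf_distance(learned, reference))  # 2: {cat, dog} vs {dog, wolf}
print(rf_distance(learned, learned))    # 0: identical trees
```

A distance of zero means the two trees group the classes identically; larger values mean more groupings appear in only one of the trees. In the setting the abstract describes, one tree would be the relation tree built from the embeddings and the other would be derived from the WordNet ontology.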

