PT-VAE: Variational autoencoder with prior concept transformation

IF 5.5 · CAS Tier 2, Computer Science · JCR Q1, Computer Science, Artificial Intelligence
Zitu Liu, Yue Liu, Zhenyao Yu, Zhengwei Yang, Qingshan Fu, Yike Guo, Qun Liu, Guoyin Wang
{"title":"PT-VAE:具有先验概念转换的变分自编码器","authors":"Zitu Liu ,&nbsp;Yue Liu ,&nbsp;Zhenyao Yu ,&nbsp;Zhengwei Yang ,&nbsp;Qingshan Fu ,&nbsp;Yike Guo ,&nbsp;Qun Liu ,&nbsp;Guoyin Wang","doi":"10.1016/j.neucom.2025.130129","DOIUrl":null,"url":null,"abstract":"<div><div>Learning and disentangling coherent latent representations of variational autoencoders (VAEs) have recently attracted widespread attention. However, the latent space of the VAE model is constrained by the prior distribution, which can hinder the latent variables from accurately capturing semantic information, thereby limiting its disentanglement and interpretability. This paper proposes PT-VAE, which constructs the latent space by a well-constructed latent space rather than a carefully designed prior distribution to guide the latent variables. Firstly, we transform the initial constraints of the latent space into understandable latent variable distributions, the so-called prior concept, which can be introduced into the latent space. Then, we design the Gumbel softmax reparameterization trick to enhance the integration of the prior concept and latent variables. Furthermore, the training process of PT-VAE is guided by deriving a variational lower bound, which facilitates the construction of the latent space concept based on the prior concept. Compared with 8 state-of-the-art VAE models, the PT-VAE improves the average clustering accuracy by over 11 % on the Fashion MNSIT, MNIST, COIL20, and COIL10 datasets. Moreover, the PT-VAE elucidates the process of information aggregation within the model and uncovers disentangled representations. PT-VAE provides a novel and flexible approach to construct an interpretable latent space by embedding prior concepts and disentangling the latent variables.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"638 ","pages":"Article 130129"},"PeriodicalIF":5.5000,"publicationDate":"2025-04-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"PT-VAE: Variational autoencoder with prior concept transformation\",\"authors\":\"Zitu Liu ,&nbsp;Yue Liu ,&nbsp;Zhenyao Yu ,&nbsp;Zhengwei Yang ,&nbsp;Qingshan Fu ,&nbsp;Yike Guo ,&nbsp;Qun Liu ,&nbsp;Guoyin Wang\",\"doi\":\"10.1016/j.neucom.2025.130129\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Learning and disentangling coherent latent representations of variational autoencoders (VAEs) have recently attracted widespread attention. However, the latent space of the VAE model is constrained by the prior distribution, which can hinder the latent variables from accurately capturing semantic information, thereby limiting its disentanglement and interpretability. This paper proposes PT-VAE, which constructs the latent space by a well-constructed latent space rather than a carefully designed prior distribution to guide the latent variables. Firstly, we transform the initial constraints of the latent space into understandable latent variable distributions, the so-called prior concept, which can be introduced into the latent space. Then, we design the Gumbel softmax reparameterization trick to enhance the integration of the prior concept and latent variables. Furthermore, the training process of PT-VAE is guided by deriving a variational lower bound, which facilitates the construction of the latent space concept based on the prior concept. 
Compared with 8 state-of-the-art VAE models, the PT-VAE improves the average clustering accuracy by over 11 % on the Fashion MNSIT, MNIST, COIL20, and COIL10 datasets. Moreover, the PT-VAE elucidates the process of information aggregation within the model and uncovers disentangled representations. PT-VAE provides a novel and flexible approach to construct an interpretable latent space by embedding prior concepts and disentangling the latent variables.</div></div>\",\"PeriodicalId\":19268,\"journal\":{\"name\":\"Neurocomputing\",\"volume\":\"638 \",\"pages\":\"Article 130129\"},\"PeriodicalIF\":5.5000,\"publicationDate\":\"2025-04-05\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Neurocomputing\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S092523122500801X\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Neurocomputing","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S092523122500801X","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0

Abstract

Learning and disentangling coherent latent representations in variational autoencoders (VAEs) has recently attracted widespread attention. However, the latent space of a VAE is constrained by the prior distribution, which can prevent the latent variables from accurately capturing semantic information, limiting disentanglement and interpretability. This paper proposes PT-VAE, which guides the latent variables with a well-constructed latent space rather than a carefully designed prior distribution. First, we transform the initial constraints on the latent space into understandable latent variable distributions, the so-called prior concept, which can be introduced into the latent space. Then, we design a Gumbel-softmax reparameterization trick to enhance the integration of the prior concept and the latent variables. Furthermore, the training process of PT-VAE is guided by a derived variational lower bound, which facilitates constructing the latent space concept from the prior concept. Compared with eight state-of-the-art VAE models, PT-VAE improves average clustering accuracy by over 11% on the Fashion-MNIST, MNIST, COIL20, and COIL10 datasets. Moreover, PT-VAE elucidates the process of information aggregation within the model and uncovers disentangled representations. PT-VAE provides a novel and flexible approach to constructing an interpretable latent space by embedding prior concepts and disentangling the latent variables.
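The abstract leans on two standard constructions worth making concrete. The paper's own variational lower bound is not reproduced here, but it extends the standard VAE evidence lower bound (ELBO):

$$\log p_\theta(x) \;\ge\; \mathcal{L}(\theta,\phi;x) = \mathbb{E}_{q_\phi(z\mid x)}\left[\log p_\theta(x\mid z)\right] - D_{\mathrm{KL}}\!\left(q_\phi(z\mid x)\,\|\,p(z)\right),$$

where $q_\phi(z\mid x)$ is the encoder and $p_\theta(x\mid z)$ the decoder. Per the abstract, PT-VAE guides the latent variables with a constructed latent space (the prior concept) rather than relying on the fixed prior $p(z)$ in this bound.

The Gumbel-softmax reparameterization mentioned in the abstract is the standard way to draw differentiable samples of a discrete variable, such as a concept assignment. The sketch below is a minimal PyTorch illustration of that estimator, not the authors' implementation; the function name and interface are assumptions.

```python
import torch
import torch.nn.functional as F

def gumbel_softmax_sample(logits: torch.Tensor, tau: float = 1.0,
                          hard: bool = False) -> torch.Tensor:
    """Differentiable sample from a categorical distribution over K concepts.

    logits: unnormalized log-probabilities, shape (batch, K).
    tau:    temperature; lower values push samples toward one-hot.
    hard:   if True, return a one-hot sample in the forward pass while
            gradients flow through the soft sample (straight-through).
    """
    # Gumbel(0, 1) noise: g = -log(-log(u)), u ~ Uniform(0, 1).
    u = torch.rand_like(logits)
    g = -torch.log(-torch.log(u + 1e-20) + 1e-20)

    # Perturb the logits and relax the argmax into a softmax.
    y_soft = F.softmax((logits + g) / tau, dim=-1)

    if hard:
        index = y_soft.argmax(dim=-1, keepdim=True)
        y_hard = torch.zeros_like(y_soft).scatter_(-1, index, 1.0)
        # Forward pass uses the one-hot y_hard; backward pass uses y_soft's gradient.
        return y_hard - y_soft.detach() + y_soft
    return y_soft
```

With `hard=True` the model commits to a discrete concept assignment in the forward pass while remaining trainable end to end; PyTorch's built-in `F.gumbel_softmax` offers the same semantics.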
Source journal: Neurocomputing (Computer Science: Artificial Intelligence)
CiteScore: 13.10
Self-citation rate: 10.00%
Articles per year: 1382
Review time: 70 days
Aims and scope: Neurocomputing publishes articles describing recent fundamental contributions in the field of neurocomputing. Neurocomputing theory, practice, and applications are the essential topics covered.