Zitu Liu, Yue Liu, Zhenyao Yu, Zhengwei Yang, Qingshan Fu, Yike Guo, Qun Liu, Guoyin Wang
{"title":"PT-VAE:具有先验概念转换的变分自编码器","authors":"Zitu Liu , Yue Liu , Zhenyao Yu , Zhengwei Yang , Qingshan Fu , Yike Guo , Qun Liu , Guoyin Wang","doi":"10.1016/j.neucom.2025.130129","DOIUrl":null,"url":null,"abstract":"<div><div>Learning and disentangling coherent latent representations of variational autoencoders (VAEs) have recently attracted widespread attention. However, the latent space of the VAE model is constrained by the prior distribution, which can hinder the latent variables from accurately capturing semantic information, thereby limiting its disentanglement and interpretability. This paper proposes PT-VAE, which constructs the latent space by a well-constructed latent space rather than a carefully designed prior distribution to guide the latent variables. Firstly, we transform the initial constraints of the latent space into understandable latent variable distributions, the so-called prior concept, which can be introduced into the latent space. Then, we design the Gumbel softmax reparameterization trick to enhance the integration of the prior concept and latent variables. Furthermore, the training process of PT-VAE is guided by deriving a variational lower bound, which facilitates the construction of the latent space concept based on the prior concept. Compared with 8 state-of-the-art VAE models, the PT-VAE improves the average clustering accuracy by over 11 % on the Fashion MNSIT, MNIST, COIL20, and COIL10 datasets. Moreover, the PT-VAE elucidates the process of information aggregation within the model and uncovers disentangled representations. PT-VAE provides a novel and flexible approach to construct an interpretable latent space by embedding prior concepts and disentangling the latent variables.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"638 ","pages":"Article 130129"},"PeriodicalIF":5.5000,"publicationDate":"2025-04-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"PT-VAE: Variational autoencoder with prior concept transformation\",\"authors\":\"Zitu Liu , Yue Liu , Zhenyao Yu , Zhengwei Yang , Qingshan Fu , Yike Guo , Qun Liu , Guoyin Wang\",\"doi\":\"10.1016/j.neucom.2025.130129\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Learning and disentangling coherent latent representations of variational autoencoders (VAEs) have recently attracted widespread attention. However, the latent space of the VAE model is constrained by the prior distribution, which can hinder the latent variables from accurately capturing semantic information, thereby limiting its disentanglement and interpretability. This paper proposes PT-VAE, which constructs the latent space by a well-constructed latent space rather than a carefully designed prior distribution to guide the latent variables. Firstly, we transform the initial constraints of the latent space into understandable latent variable distributions, the so-called prior concept, which can be introduced into the latent space. Then, we design the Gumbel softmax reparameterization trick to enhance the integration of the prior concept and latent variables. Furthermore, the training process of PT-VAE is guided by deriving a variational lower bound, which facilitates the construction of the latent space concept based on the prior concept. Compared with 8 state-of-the-art VAE models, the PT-VAE improves the average clustering accuracy by over 11 % on the Fashion MNSIT, MNIST, COIL20, and COIL10 datasets. 
Moreover, the PT-VAE elucidates the process of information aggregation within the model and uncovers disentangled representations. PT-VAE provides a novel and flexible approach to construct an interpretable latent space by embedding prior concepts and disentangling the latent variables.</div></div>\",\"PeriodicalId\":19268,\"journal\":{\"name\":\"Neurocomputing\",\"volume\":\"638 \",\"pages\":\"Article 130129\"},\"PeriodicalIF\":5.5000,\"publicationDate\":\"2025-04-05\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Neurocomputing\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S092523122500801X\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Neurocomputing","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S092523122500801X","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
PT-VAE: Variational autoencoder with prior concept transformation
Learning and disentangling coherent latent representations in variational autoencoders (VAEs) has recently attracted widespread attention. However, the latent space of a VAE is constrained by the prior distribution, which can prevent the latent variables from accurately capturing semantic information and thus limits disentanglement and interpretability. This paper proposes PT-VAE, which guides the latent variables with a well-constructed latent space rather than a carefully designed prior distribution. First, we transform the initial constraints of the latent space into understandable latent-variable distributions, referred to as the prior concept, which can be introduced into the latent space. Then, we design a Gumbel-softmax reparameterization trick to strengthen the integration of the prior concept and the latent variables. Furthermore, the training of PT-VAE is guided by a derived variational lower bound, which facilitates constructing the latent-space concept from the prior concept. Compared with eight state-of-the-art VAE models, PT-VAE improves average clustering accuracy by over 11% on the Fashion MNIST, MNIST, COIL20, and COIL10 datasets. Moreover, PT-VAE elucidates the process of information aggregation within the model and uncovers disentangled representations. PT-VAE provides a novel and flexible approach to constructing an interpretable latent space by embedding prior concepts and disentangling the latent variables.
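The abstract mentions a Gumbel-softmax reparameterization for integrating the prior concept with the latent variables. The sketch below is a minimal illustration of that general idea, not the authors' implementation: the GumbelConceptLayer module, the learnable concept-embedding table, the dimensions, and the additive fusion of the concept vector with the latent code are all assumptions made for the example.

import torch
import torch.nn as nn
import torch.nn.functional as F

class GumbelConceptLayer(nn.Module):
    """Fuses a soft, differentiable concept assignment with the latent code (illustrative)."""
    def __init__(self, latent_dim, num_concepts, tau=1.0):
        super().__init__()
        self.to_logits = nn.Linear(latent_dim, num_concepts)    # concept logits computed from z
        self.concepts = nn.Embedding(num_concepts, latent_dim)  # learnable concept vectors (assumed)
        self.tau = tau                                           # Gumbel-softmax temperature

    def forward(self, z):
        logits = self.to_logits(z)
        # Differentiable sample from the categorical concept distribution
        # (set hard=True for straight-through, discrete assignments).
        y = F.gumbel_softmax(logits, tau=self.tau, hard=False, dim=-1)
        # Mix the concept vectors by the soft assignment and fuse with z.
        return z + y @ self.concepts.weight

# Hypothetical usage after a standard VAE encoder:
# z = mu + torch.randn_like(mu) * sigma        # Gaussian reparameterization
# z_concept = GumbelConceptLayer(32, 10)(z)    # concept-aware latent code fed to the decoder

During training, a term for the concept assignment would typically enter the derived variational lower bound alongside the usual reconstruction and KL terms; the exact form of that bound is the paper's contribution and is not reproduced in this sketch.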
Journal Introduction:
Neurocomputing publishes articles describing recent fundamental contributions to the field of neurocomputing. Neurocomputing theory, practice, and applications are the essential topics covered.