Posterior Collapse and Latent Variable Non-identifiability

Advances in Neural Information Processing Systems. Pub Date: 2023-01-02. DOI: 10.48550/arXiv.2301.00537

Yixin Wang, D. Blei, J. Cunningham

Pages: 5443-5455

Citations: 35

Abstract

Variational autoencoders model high-dimensional data by positing low-dimensional latent variables that are mapped through a flexible distribution parametrized by a neural network. Unfortunately, variational autoencoders often suffer from posterior collapse: the posterior of the latent variables is equal to its prior, rendering the variational autoencoder useless as a means to produce meaningful representations. Existing approaches to posterior collapse often attribute it to the use of neural networks or optimization issues due to variational approximation. In this paper, we consider posterior collapse as a problem of latent variable non-identifiability. We prove that the posterior collapses if and only if the latent variables are non-identifiable in the generative model. This fact implies that posterior collapse is not a phenomenon specific to the use of flexible distributions or approximate inference. Rather, it can occur in classical probabilistic models even with exact inference, which we also demonstrate. Based on these results, we propose a class of latent-identifiable variational autoencoders, deep generative models which enforce identifiability without sacrificing flexibility. This model class resolves the problem of latent variable non-identifiability by leveraging bijective Brenier maps and parameterizing them with input convex neural networks, without special variational inference objectives or optimization tricks. Across synthetic and real datasets, latent-identifiable variational autoencoders outperform existing methods in mitigating posterior collapse and providing meaningful representations of the data.
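The claim that collapse can occur in a classical model under exact inference can be checked numerically. The sketch below is only an illustration under stated assumptions: the two-component Gaussian mixture, its parameter values, and the helper names `gaussian_pdf` and `exact_posterior` are hypothetical choices, not taken from the paper's experiments. When both components share a mean, the likelihood p(x | z) does not depend on z, so Bayes' rule returns the prior; when the means differ, the posterior moves away from the prior.

```python
import numpy as np


def gaussian_pdf(x, mean, std):
    """Density of a univariate normal N(mean, std^2) evaluated at x."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))


def exact_posterior(x, prior, means, std=1.0):
    """Exact p(z = k | x) for a finite Gaussian mixture, via Bayes' rule.

    No variational approximation and no neural network are involved.
    """
    prior = np.asarray(prior, dtype=float)
    means = np.asarray(means, dtype=float)
    likelihood = gaussian_pdf(x, means, std)  # p(x | z = k) for each component k
    joint = prior * likelihood                # p(z = k) * p(x | z = k)
    return joint / joint.sum()                # normalize by the marginal p(x)


# Illustrative values (assumptions, not from the paper).
prior = np.array([0.3, 0.7])
x = 1.5

# Non-identifiable latent: both components have the same mean,
# so the likelihood is identical for every value of z.
collapsed = exact_posterior(x, prior, means=[0.0, 0.0])

# Identifiable latent: the components differ, so x carries information about z.
informative = exact_posterior(x, prior, means=[-2.0, 2.0])

print("prior:          ", prior)
print("equal means:    ", collapsed)
print("distinct means: ", informative)
```

In the equal-means case the likelihood factor cancels in the normalization, which is the same cancellation that drives the general equivalence stated in the abstract.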


