Beta-Sigma VAE: Separating beta and decoder variance in Gaussian variational autoencoder

Seunghwan Kim, Seungkyu Lee
{"title":"贝塔-西格玛 VAE:分离高斯变异自动编码器中的贝塔方差和解码器方差","authors":"Seunghwan Kim, Seungkyu Lee","doi":"arxiv-2409.09361","DOIUrl":null,"url":null,"abstract":"Variational autoencoder (VAE) is an established generative model but is\nnotorious for its blurriness. In this work, we investigate the blurry output\nproblem of VAE and resolve it, exploiting the variance of Gaussian decoder and\n$\\beta$ of beta-VAE. Specifically, we reveal that the indistinguishability of\ndecoder variance and $\\beta$ hinders appropriate analysis of the model by\nrandom likelihood value, and limits performance improvement by omitting the\ngain from $\\beta$. To address the problem, we propose Beta-Sigma VAE (BS-VAE)\nthat explicitly separates $\\beta$ and decoder variance $\\sigma^2_x$ in the\nmodel. Our method demonstrates not only superior performance in natural image\nsynthesis but also controllable parameters and predictable analysis compared to\nconventional VAE. In our experimental evaluation, we employ the analysis of\nrate-distortion curve and proxy metrics on computer vision datasets. The code\nis available on https://github.com/overnap/BS-VAE","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Beta-Sigma VAE: Separating beta and decoder variance in Gaussian variational autoencoder\",\"authors\":\"Seunghwan Kim, Seungkyu Lee\",\"doi\":\"arxiv-2409.09361\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Variational autoencoder (VAE) is an established generative model but is\\nnotorious for its blurriness. In this work, we investigate the blurry output\\nproblem of VAE and resolve it, exploiting the variance of Gaussian decoder and\\n$\\\\beta$ of beta-VAE. Specifically, we reveal that the indistinguishability of\\ndecoder variance and $\\\\beta$ hinders appropriate analysis of the model by\\nrandom likelihood value, and limits performance improvement by omitting the\\ngain from $\\\\beta$. To address the problem, we propose Beta-Sigma VAE (BS-VAE)\\nthat explicitly separates $\\\\beta$ and decoder variance $\\\\sigma^2_x$ in the\\nmodel. Our method demonstrates not only superior performance in natural image\\nsynthesis but also controllable parameters and predictable analysis compared to\\nconventional VAE. In our experimental evaluation, we employ the analysis of\\nrate-distortion curve and proxy metrics on computer vision datasets. 
The code\\nis available on https://github.com/overnap/BS-VAE\",\"PeriodicalId\":501340,\"journal\":{\"name\":\"arXiv - STAT - Machine Learning\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-09-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - STAT - Machine Learning\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2409.09361\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - STAT - Machine Learning","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.09361","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

The variational autoencoder (VAE) is an established generative model, but it is notorious for its blurriness. In this work, we investigate the blurry-output problem of the VAE and resolve it by exploiting the variance of the Gaussian decoder and the $\beta$ of beta-VAE. Specifically, we reveal that the indistinguishability of the decoder variance and $\beta$ hinders appropriate analysis of the model, because the likelihood value becomes effectively arbitrary, and limits performance improvement by omitting the gain from $\beta$. To address this problem, we propose the Beta-Sigma VAE (BS-VAE), which explicitly separates $\beta$ and the decoder variance $\sigma^2_x$ in the model. Compared to the conventional VAE, our method demonstrates not only superior performance in natural image synthesis but also controllable parameters and predictable analysis. In our experimental evaluation, we employ rate-distortion curve analysis and proxy metrics on computer vision datasets. The code is available at https://github.com/overnap/BS-VAE.
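The abstract states the idea but gives no implementation details. As a rough illustration only, the sketch below writes a Gaussian-decoder VAE objective in which the decoder variance $\sigma^2_x$ and the beta-VAE weight $\beta$ appear as two separate, explicit parameters. The function name, the choice of a fixed scalar sigma_x, and the tensor shapes are assumptions made for the example, not the authors' code; their implementation is in the linked repository.

```python
# Minimal sketch of a Gaussian-decoder VAE objective in which the decoder
# variance sigma_x^2 and the beta-VAE weight beta are two separate, explicit
# hyperparameters. Inferred from the abstract only; the authors' actual
# implementation is at https://github.com/overnap/BS-VAE.
import math

import torch


def gaussian_vae_loss(x, x_hat, mu, logvar, sigma_x=0.1, beta=1.0):
    """Negative ELBO per batch with explicit sigma_x and beta.

    x, x_hat  : (B, D) inputs and reconstructions
    mu, logvar: (B, Z) parameters of the approximate posterior q(z|x)
    sigma_x   : decoder standard deviation (a fixed scalar in this sketch)
    beta      : KL weight from beta-VAE
    """
    d = x.shape[1]
    # Gaussian negative log-likelihood of x under N(x_hat, sigma_x^2 I):
    #   -log p(x|z) = ||x - x_hat||^2 / (2 sigma_x^2) + (D/2) log(2 pi sigma_x^2)
    recon = ((x - x_hat) ** 2).sum(dim=1) / (2.0 * sigma_x ** 2) \
            + 0.5 * d * math.log(2.0 * math.pi * sigma_x ** 2)
    # Closed-form KL( q(z|x) || N(0, I) ) for a diagonal Gaussian posterior.
    kl = 0.5 * (mu ** 2 + logvar.exp() - 1.0 - logvar).sum(dim=1)
    # beta scales only the KL term; sigma_x enters only through the likelihood,
    # so the two parameters remain distinguishable.
    return (recon + beta * kl).mean()


# Usage with random tensors, just to show the expected shapes.
x = torch.rand(8, 784)
x_hat = torch.rand(8, 784)
mu, logvar = torch.zeros(8, 16), torch.zeros(8, 16)
loss = gaussian_vae_loss(x, x_hat, mu, logvar, sigma_x=0.1, beta=4.0)
```

In a conventional formulation that replaces the Gaussian likelihood with a plain mean-squared error, $\beta$ and $1/(2\sigma^2_x)$ collapse into a single effective coefficient on the KL term, which appears to be the indistinguishability the abstract refers to; keeping the full likelihood with its $\log \sigma^2_x$ term, as above, lets the two parameters be set and analyzed independently.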