Beta-Sigma VAE: Separating beta and decoder variance in Gaussian variational autoencoder

Seunghwan Kim, Seungkyu Lee
{"title":"贝塔-西格玛 VAE:分离高斯变异自动编码器中的贝塔方差和解码器方差","authors":"Seunghwan Kim, Seungkyu Lee","doi":"arxiv-2409.09361","DOIUrl":null,"url":null,"abstract":"Variational autoencoder (VAE) is an established generative model but is\nnotorious for its blurriness. In this work, we investigate the blurry output\nproblem of VAE and resolve it, exploiting the variance of Gaussian decoder and\n$\\beta$ of beta-VAE. Specifically, we reveal that the indistinguishability of\ndecoder variance and $\\beta$ hinders appropriate analysis of the model by\nrandom likelihood value, and limits performance improvement by omitting the\ngain from $\\beta$. To address the problem, we propose Beta-Sigma VAE (BS-VAE)\nthat explicitly separates $\\beta$ and decoder variance $\\sigma^2_x$ in the\nmodel. Our method demonstrates not only superior performance in natural image\nsynthesis but also controllable parameters and predictable analysis compared to\nconventional VAE. In our experimental evaluation, we employ the analysis of\nrate-distortion curve and proxy metrics on computer vision datasets. The code\nis available on https://github.com/overnap/BS-VAE","PeriodicalId":501340,"journal":{"name":"arXiv - STAT - Machine Learning","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Beta-Sigma VAE: Separating beta and decoder variance in Gaussian variational autoencoder\",\"authors\":\"Seunghwan Kim, Seungkyu Lee\",\"doi\":\"arxiv-2409.09361\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Variational autoencoder (VAE) is an established generative model but is\\nnotorious for its blurriness. In this work, we investigate the blurry output\\nproblem of VAE and resolve it, exploiting the variance of Gaussian decoder and\\n$\\\\beta$ of beta-VAE. Specifically, we reveal that the indistinguishability of\\ndecoder variance and $\\\\beta$ hinders appropriate analysis of the model by\\nrandom likelihood value, and limits performance improvement by omitting the\\ngain from $\\\\beta$. To address the problem, we propose Beta-Sigma VAE (BS-VAE)\\nthat explicitly separates $\\\\beta$ and decoder variance $\\\\sigma^2_x$ in the\\nmodel. Our method demonstrates not only superior performance in natural image\\nsynthesis but also controllable parameters and predictable analysis compared to\\nconventional VAE. In our experimental evaluation, we employ the analysis of\\nrate-distortion curve and proxy metrics on computer vision datasets. 
The code\\nis available on https://github.com/overnap/BS-VAE\",\"PeriodicalId\":501340,\"journal\":{\"name\":\"arXiv - STAT - Machine Learning\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-09-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - STAT - Machine Learning\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2409.09361\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - STAT - Machine Learning","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.09361","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

The variational autoencoder (VAE) is an established generative model, but it is notorious for its blurriness. In this work, we investigate the blurry-output problem of the VAE and resolve it by exploiting the variance of the Gaussian decoder and the $\beta$ of beta-VAE. Specifically, we reveal that the indistinguishability of the decoder variance and $\beta$ hinders appropriate analysis of the model, because the likelihood value becomes effectively arbitrary, and limits performance improvement by omitting the gain from $\beta$. To address this problem, we propose the Beta-Sigma VAE (BS-VAE), which explicitly separates $\beta$ and the decoder variance $\sigma^2_x$ in the model. Compared to the conventional VAE, our method demonstrates not only superior performance in natural image synthesis but also controllable parameters and predictable analysis. In our experimental evaluation, we employ rate-distortion curve analysis and proxy metrics on computer vision datasets. The code is available at https://github.com/overnap/BS-VAE.
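The abstract states the idea but gives no implementation details. As a rough illustration only, the sketch below writes a Gaussian-decoder VAE objective in which the decoder variance $\sigma^2_x$ and the beta-VAE weight $\beta$ appear as two separate, explicit parameters. The function name, the choice of a fixed scalar sigma_x, and the tensor shapes are assumptions made for the example, not the authors' code; their implementation is in the linked repository.

```python
# Minimal sketch of a Gaussian-decoder VAE objective in which the decoder
# variance sigma_x^2 and the beta-VAE weight beta are two separate, explicit
# hyperparameters. Inferred from the abstract only; the authors' actual
# implementation is at https://github.com/overnap/BS-VAE.
import math

import torch


def gaussian_vae_loss(x, x_hat, mu, logvar, sigma_x=0.1, beta=1.0):
    """Negative ELBO per batch with explicit sigma_x and beta.

    x, x_hat  : (B, D) inputs and reconstructions
    mu, logvar: (B, Z) parameters of the approximate posterior q(z|x)
    sigma_x   : decoder standard deviation (a fixed scalar in this sketch)
    beta      : KL weight from beta-VAE
    """
    d = x.shape[1]
    # Gaussian negative log-likelihood of x under N(x_hat, sigma_x^2 I):
    #   -log p(x|z) = ||x - x_hat||^2 / (2 sigma_x^2) + (D/2) log(2 pi sigma_x^2)
    recon = ((x - x_hat) ** 2).sum(dim=1) / (2.0 * sigma_x ** 2) \
            + 0.5 * d * math.log(2.0 * math.pi * sigma_x ** 2)
    # Closed-form KL( q(z|x) || N(0, I) ) for a diagonal Gaussian posterior.
    kl = 0.5 * (mu ** 2 + logvar.exp() - 1.0 - logvar).sum(dim=1)
    # beta scales only the KL term; sigma_x enters only through the likelihood,
    # so the two parameters remain distinguishable.
    return (recon + beta * kl).mean()


# Usage with random tensors, just to show the expected shapes.
x = torch.rand(8, 784)
x_hat = torch.rand(8, 784)
mu, logvar = torch.zeros(8, 16), torch.zeros(8, 16)
loss = gaussian_vae_loss(x, x_hat, mu, logvar, sigma_x=0.1, beta=4.0)
```

In a conventional formulation that replaces the Gaussian likelihood with a plain mean-squared error, $\beta$ and $1/(2\sigma^2_x)$ collapse into a single effective coefficient on the KL term, which appears to be the indistinguishability the abstract refers to; keeping the full likelihood with its $\log \sigma^2_x$ term, as above, lets the two parameters be set and analyzed independently.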