Toward Improving the Generation Quality of Autoregressive Slot VAEs

IF 2.7 4区 计算机科学 Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Patrick Emami;Pan He;Sanjay Ranka;Anand Rangarajan
{"title":"Toward Improving the Generation Quality of Autoregressive Slot VAEs","authors":"Patrick Emami;Pan He;Sanjay Ranka;Anand Rangarajan","doi":"10.1162/neco_a_01635","DOIUrl":null,"url":null,"abstract":"Unconditional scene inference and generation are challenging to learn jointly with a single compositional model. Despite encouraging progress on models that extract object-centric representations (“slots”) from images, unconditional generation of scenes from slots has received less attention. This is primarily because learning the multiobject relations necessary to imagine coherent scenes is difficult. We hypothesize that most existing slot-based models have a limited ability to learn object correlations. We propose two improvements that strengthen object correlation learning. The first is to condition the slots on a global, scene-level variable that captures higher-order correlations between slots. Second, we address the fundamental lack of a canonical order for objects in images by proposing to learn a consistent order to use for the autoregressive generation of scene objects. Specifically, we train an autoregressive slot prior to sequentially generate scene objects following a learned order. Ordered slot inference entails first estimating a randomly ordered set of slots using existing approaches for extracting slots from images, then aligning those slots to ordered slots generated autoregressively with the slot prior. Our experiments across three multiobject environments demonstrate clear gains in unconditional scene generation quality. Detailed ablation studies are also provided that validate the two proposed improvements.","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":"36 5","pages":"858-896"},"PeriodicalIF":2.7000,"publicationDate":"2024-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Neural Computation","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10535075/","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0

Abstract

Unconditional scene inference and generation are challenging to learn jointly with a single compositional model. Despite encouraging progress on models that extract object-centric representations (“slots”) from images, unconditional generation of scenes from slots has received less attention. This is primarily because learning the multiobject relations necessary to imagine coherent scenes is difficult. We hypothesize that most existing slot-based models have a limited ability to learn object correlations. We propose two improvements that strengthen object correlation learning. The first is to condition the slots on a global, scene-level variable that captures higher-order correlations between slots. Second, we address the fundamental lack of a canonical order for objects in images by proposing to learn a consistent order to use for the autoregressive generation of scene objects. Specifically, we train an autoregressive slot prior to sequentially generate scene objects following a learned order. Ordered slot inference entails first estimating a randomly ordered set of slots using existing approaches for extracting slots from images, then aligning those slots to ordered slots generated autoregressively with the slot prior. Our experiments across three multiobject environments demonstrate clear gains in unconditional scene generation quality. Detailed ablation studies are also provided that validate the two proposed improvements.
努力提高自回归槽式 VAE 的生成质量
无条件场景推理和生成对使用单一合成模型进行联合学习具有挑战性。尽管在从图像中提取以物体为中心的表征("槽")的模型方面取得了令人鼓舞的进展,但从 "槽 "中无条件生成场景的研究却较少受到关注。这主要是因为学习想象连贯场景所需的多物体关系非常困难。我们假设,大多数现有的基于插槽的模型学习物体相关性的能力有限。我们提出了两个改进方案来加强物体相关性学习。首先,将捕捉槽间高阶相关性的全局场景级变量作为槽的条件。其次,我们针对图像中物体缺乏典型顺序这一根本问题,提出了学习一致的顺序,用于场景物体的自回归生成。具体来说,我们先训练一个自回归插槽,然后按照学习到的顺序依次生成场景对象。有序插槽推理首先需要使用现有的从图像中提取插槽的方法来估计一组随机有序的插槽,然后将这些插槽与使用插槽先验自回归生成的有序插槽对齐。我们在三个多目标环境中进行的实验表明,无条件场景生成质量明显提高。我们还提供了详细的消融研究,验证了这两项改进建议。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
Neural Computation
Neural Computation 工程技术-计算机:人工智能
CiteScore
6.30
自引率
3.40%
发文量
83
审稿时长
3.0 months
期刊介绍: Neural Computation is uniquely positioned at the crossroads between neuroscience and TMCS and welcomes the submission of original papers from all areas of TMCS, including: Advanced experimental design; Analysis of chemical sensor data; Connectomic reconstructions; Analysis of multielectrode and optical recordings; Genetic data for cell identity; Analysis of behavioral data; Multiscale models; Analysis of molecular mechanisms; Neuroinformatics; Analysis of brain imaging data; Neuromorphic engineering; Principles of neural coding, computation, circuit dynamics, and plasticity; Theories of brain function.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信