Exploiting Inductive Bias in Transformers for Unsupervised Disentanglement of Syntax and Semantics with VAEs

Ghazi Felhi, Joseph Le Roux, Djamé Seddah
{"title":"Exploiting Inductive Bias in Transformers for Unsupervised Disentanglement of Syntax and Semantics with VAEs","authors":"G. Felhi, Joseph Le Roux, Djamé Seddah","doi":"10.48550/arXiv.2205.05943","DOIUrl":null,"url":null,"abstract":"We propose a generative model for text generation, which exhibits disentangled latent representations of syntax and semantics. Contrary to previous work, this model does not need syntactic information such as constituency parses, or semantic information such as paraphrase pairs. Our model relies solely on the inductive bias found in attention-based architectures such as Transformers. In the attention of Transformers, keys handle information selection while values specify what information is conveyed. Our model, dubbed QKVAE, uses Attention in its decoder to read latent variables where one latent variable infers keys while another infers values. We run experiments on latent representations and experiments on syntax/semantics transfer which show that QKVAE displays clear signs of disentangled syntax and semantics. We also show that our model displays competitive syntax transfer capabilities when compared to supervised models and that comparable supervised models need a fairly large amount of data (more than 50K samples) to outperform it on both syntactic and semantic transfer. The code for our experiments is publicly available.","PeriodicalId":382084,"journal":{"name":"North American Chapter of the Association for Computational Linguistics","volume":"14 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"North American Chapter of the Association for Computational Linguistics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.48550/arXiv.2205.05943","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

Abstract

We propose a generative model for text generation which exhibits disentangled latent representations of syntax and semantics. Unlike previous work, this model needs neither syntactic information, such as constituency parses, nor semantic information, such as paraphrase pairs. It relies solely on the inductive bias found in attention-based architectures such as Transformers. In Transformer attention, keys handle information selection while values specify what information is conveyed. Our model, dubbed QKVAE, uses attention in its decoder to read latent variables, where one latent variable infers the keys while another infers the values. Experiments on latent representations and on syntax/semantics transfer show that QKVAE displays clear signs of disentangled syntax and semantics. We also show that our model has competitive syntax transfer capabilities compared to supervised models, and that comparable supervised models need a fairly large amount of data (more than 50K samples) to outperform it on both syntactic and semantic transfer. The code for our experiments is publicly available.
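To make the key/value reading mechanism concrete, here is a minimal PyTorch sketch of the decoder attention described above. The module name LatentKVAttention, the slot count n_slots, and all dimensions are illustrative assumptions rather than the authors' released implementation: a syntax latent z_syn is projected to the attention keys and a semantic latent z_sem to the values, while queries come from the decoder's token states.

```python
# Minimal sketch of the QKVAE decoder idea: keys from one latent,
# values from another. Names and shapes are hypothetical, not the
# paper's released code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LatentKVAttention(nn.Module):
    def __init__(self, d_model: int, d_latent: int, n_slots: int):
        super().__init__()
        self.n_slots = n_slots
        self.q_proj = nn.Linear(d_model, d_model)
        # Assumed design: each latent vector is expanded into n_slots
        # key/value vectors that the decoder attends over.
        self.k_proj = nn.Linear(d_latent, n_slots * d_model)
        self.v_proj = nn.Linear(d_latent, n_slots * d_model)

    def forward(self, h, z_syn, z_sem):
        # h:     (batch, seq_len, d_model) decoder token states
        # z_syn: (batch, d_latent) latent inferring the keys (syntax)
        # z_sem: (batch, d_latent) latent inferring the values (semantics)
        B, T, D = h.shape
        q = self.q_proj(h)                               # (B, T, D)
        k = self.k_proj(z_syn).view(B, self.n_slots, D)  # (B, S, D)
        v = self.v_proj(z_sem).view(B, self.n_slots, D)  # (B, S, D)
        # Scaled dot-product attention: keys decide which slot each
        # token reads from; values carry the content that is read.
        attn = F.softmax(q @ k.transpose(1, 2) / D ** 0.5, dim=-1)
        return attn @ v                                  # (B, T, D)

# Usage: one such attention read per decoding step or layer.
layer = LatentKVAttention(d_model=256, d_latent=64, n_slots=4)
h = torch.randn(2, 10, 256)
z_syn, z_sem = torch.randn(2, 64), torch.randn(2, 64)
out = layer(h, z_syn, z_sem)  # (2, 10, 256)
```

Under this wiring, swapping z_syn between two sentences changes which slots each token attends to (selection, i.e. structure), while swapping z_sem changes the content those slots deliver, which matches the intuition behind the paper's syntax/semantics transfer experiments.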