Language as a latent sequence: Deep latent variable models for semi-supervised paraphrase generation

Jialin Yu, Alexandra I. Cristea, Anoushka Harit, Zhongtian Sun, Olanrewaju Tahir Aduragba, Lei Shi, Noura Al Moubayed
{"title":"作为潜在序列的语言:半监督转述生成的深层潜在变量模型","authors":"Jialin Yu ,&nbsp;Alexandra I. Cristea ,&nbsp;Anoushka Harit ,&nbsp;Zhongtian Sun ,&nbsp;Olanrewaju Tahir Aduragba ,&nbsp;Lei Shi ,&nbsp;Noura Al Moubayed","doi":"10.1016/j.aiopen.2023.05.001","DOIUrl":null,"url":null,"abstract":"<div><p>This paper explores deep latent variable models for semi-supervised paraphrase generation, where the missing target pair for unlabelled data is modelled as a latent paraphrase sequence. We present a novel unsupervised model named <em>variational sequence auto-encoding reconstruction</em> (<strong>VSAR</strong>), which performs latent sequence inference given an observed text. To leverage information from text pairs, we additionally introduce a novel supervised model we call <em>dual directional learning</em> (<strong>DDL</strong>), which is designed to integrate with our proposed VSAR model. Combining VSAR with DDL (<strong>DDL+VSAR</strong>) enables us to conduct semi-supervised learning. Still, the combined model suffers from a cold-start problem. To further combat this issue, we propose an improved weight initialisation solution, leading to a novel two-stage training scheme we call <em>knowledge-reinforced-learning</em> (<strong>KRL</strong>). Our empirical evaluations suggest that the combined model yields competitive performance against the state-of-the-art supervised baselines on complete data. Furthermore, in scenarios where only a fraction of the labelled pairs are available, our combined model consistently outperforms the strong supervised model baseline (<strong>DDL</strong>) by a significant margin (<span><math><mrow><mi>p</mi><mo>&lt;</mo><mo>.</mo><mn>05</mn></mrow></math></span>; Wilcoxon test). Our code is publicly available at https://github.com/jialin-yu/latent-sequence-paraphrase.</p></div>","PeriodicalId":100068,"journal":{"name":"AI Open","volume":"4 ","pages":"Pages 19-32"},"PeriodicalIF":0.0000,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Language as a latent sequence: Deep latent variable models for semi-supervised paraphrase generation\",\"authors\":\"Jialin Yu ,&nbsp;Alexandra I. Cristea ,&nbsp;Anoushka Harit ,&nbsp;Zhongtian Sun ,&nbsp;Olanrewaju Tahir Aduragba ,&nbsp;Lei Shi ,&nbsp;Noura Al Moubayed\",\"doi\":\"10.1016/j.aiopen.2023.05.001\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>This paper explores deep latent variable models for semi-supervised paraphrase generation, where the missing target pair for unlabelled data is modelled as a latent paraphrase sequence. We present a novel unsupervised model named <em>variational sequence auto-encoding reconstruction</em> (<strong>VSAR</strong>), which performs latent sequence inference given an observed text. To leverage information from text pairs, we additionally introduce a novel supervised model we call <em>dual directional learning</em> (<strong>DDL</strong>), which is designed to integrate with our proposed VSAR model. Combining VSAR with DDL (<strong>DDL+VSAR</strong>) enables us to conduct semi-supervised learning. Still, the combined model suffers from a cold-start problem. To further combat this issue, we propose an improved weight initialisation solution, leading to a novel two-stage training scheme we call <em>knowledge-reinforced-learning</em> (<strong>KRL</strong>). 
Our empirical evaluations suggest that the combined model yields competitive performance against the state-of-the-art supervised baselines on complete data. Furthermore, in scenarios where only a fraction of the labelled pairs are available, our combined model consistently outperforms the strong supervised model baseline (<strong>DDL</strong>) by a significant margin (<span><math><mrow><mi>p</mi><mo>&lt;</mo><mo>.</mo><mn>05</mn></mrow></math></span>; Wilcoxon test). Our code is publicly available at https://github.com/jialin-yu/latent-sequence-paraphrase.</p></div>\",\"PeriodicalId\":100068,\"journal\":{\"name\":\"AI Open\",\"volume\":\"4 \",\"pages\":\"Pages 19-32\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"AI Open\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S2666651023000025\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"AI Open","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2666651023000025","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract


This paper explores deep latent variable models for semi-supervised paraphrase generation, where the missing target pair for unlabelled data is modelled as a latent paraphrase sequence. We present a novel unsupervised model named variational sequence auto-encoding reconstruction (VSAR), which performs latent sequence inference given an observed text. To leverage information from text pairs, we additionally introduce a novel supervised model we call dual directional learning (DDL), which is designed to integrate with our proposed VSAR model. Combining VSAR with DDL (DDL+VSAR) enables us to conduct semi-supervised learning. Still, the combined model suffers from a cold-start problem. To further combat this issue, we propose an improved weight initialisation solution, leading to a novel two-stage training scheme we call knowledge-reinforced-learning (KRL). Our empirical evaluations suggest that the combined model yields competitive performance against the state-of-the-art supervised baselines on complete data. Furthermore, in scenarios where only a fraction of the labelled pairs are available, our combined model consistently outperforms the strong supervised model baseline (DDL) by a significant margin (p<.05; Wilcoxon test). Our code is publicly available at https://github.com/jialin-yu/latent-sequence-paraphrase.
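The abstract describes the training pipeline only at a high level. The sketch below, in PyTorch-flavoured Python, illustrates one plausible wiring of the pieces it names: the supervised DDL objective over labelled paraphrase pairs, the unsupervised VSAR objective that treats the missing paraphrase as a latent sequence, and the two-stage KRL scheme in which a supervised warm-up supplies the weight initialisation for joint semi-supervised training. Every identifier here (`model.nll`, `model.infer_latent_sequence`, the loop structure) is an assumption made for illustration, not the authors' actual API; the real implementation is in the linked repository.

```python
# Illustrative sketch only: the model interface and loss forms are assumptions,
# not the authors' actual code (see the linked repository for that).

def ddl_loss(model, src, tgt):
    # Dual directional learning (DDL): supervised training on a labelled
    # paraphrase pair in both directions, src -> tgt and tgt -> src.
    return model.nll(src, tgt) + model.nll(tgt, src)

def vsar_loss(model, text):
    # Variational sequence auto-encoding reconstruction (VSAR): the missing
    # paraphrase of an unlabelled sentence is a latent sequence z. Sample z
    # from an approximate posterior q(z | text) and score reconstruction of
    # the observed text from z, plus a KL penalty (an ELBO-style bound).
    z, kl = model.infer_latent_sequence(text)
    return model.nll(z, text) + kl

def train_step(model, optimiser, loss):
    optimiser.zero_grad()
    loss.backward()
    optimiser.step()

def knowledge_reinforced_learning(model, labelled, unlabelled, optimiser,
                                  warmup_epochs, joint_epochs):
    # Stage 1 (warm-up): supervised DDL training on labelled pairs only,
    # giving the combined model an informed weight initialisation and
    # sidestepping the cold-start problem noted in the abstract.
    for _ in range(warmup_epochs):
        for src, tgt in labelled:
            train_step(model, optimiser, ddl_loss(model, src, tgt))

    # Stage 2 (joint): semi-supervised training with the combined
    # DDL + VSAR objective over labelled and unlabelled data.
    for _ in range(joint_epochs):
        for (src, tgt), text in zip(labelled, unlabelled):
            loss = ddl_loss(model, src, tgt) + vsar_loss(model, text)
            train_step(model, optimiser, loss)
```

On this reading, the ordering is the essential design choice: starting the joint objective from scratch is exactly the cold start the abstract warns about, whereas the Stage-1 warm-up lets VSAR's latent-sequence inference begin from a model that already produces reasonable paraphrases.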
