Syntax-controlled paraphrases generation with VAE and multi-task learning

IF 3.1 | Zone 3, Computer Science | JCR Q2, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Xiyuan Jia, Zongqing Mao, Zhen Zhang, Qiyun Lv, Xin Wang, Guohua Wu
Journal: Computer Speech and Language
Published: 2024-08-08
DOI: 10.1016/j.csl.2024.101705
URL: https://www.sciencedirect.com/science/article/pii/S0885230824000883
Citations: 0

Abstract

Paraphrase generation is an important method for augmenting text data and plays a crucial role in Natural Language Generation (NLG). However, existing methods struggle to capture both the semantic representation of the input sentence and the syntactic structure of the exemplar, which readily leads to redundant content, semantic inaccuracies, and poor diversity. To tackle these challenges, we propose a Syntax-Controlled Paraphrase Generator (SCPG), which uses attention networks and latent variables from a variational autoencoder (VAE) to model the semantics of input sentences and the syntax of exemplars. To make the structure of the target paraphrase controllable, we further propose a multi-task learning method for learning semantic and syntactic representations, and integrate the two through a gating mechanism. Extensive experiments show that SCPG achieves state-of-the-art results in both semantic consistency and syntactic controllability, and strikes a better trade-off between preserving semantics and producing novel sentence structure.
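
The abstract names two building blocks without giving equations: VAE-based latent variables and a gating mechanism that merges semantic and syntactic representations. The sketch below illustrates the textbook form of each, not the SCPG implementation. All function names, the vector sizes, and the per-dimension (diagonal) gate weights are assumptions made for illustration; the paper's actual encoders, gate parameterization, and losses may differ.

```python
import math
import random


def reparameterize(mu, logvar, rng):
    """Sample z = mu + sigma * eps, eps ~ N(0, 1), elementwise (standard VAE trick)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, logvar)]


def kl_to_standard_normal(mu, logvar):
    """KL(N(mu, sigma^2) || N(0, I)) for a diagonal Gaussian posterior."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, logvar))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(semantic, syntactic, w_sem, w_syn, bias):
    """Mix two vectors with an elementwise gate g in (0, 1).

    Assumption: the gate uses one weight per dimension (a diagonal
    simplification of a learned linear layer), and the output is
    g * semantic + (1 - g) * syntactic.
    """
    fused = []
    for s, y, ws, wy, b in zip(semantic, syntactic, w_sem, w_syn, bias):
        g = sigmoid(ws * s + wy * y + b)
        fused.append(g * s + (1.0 - g) * y)
    return fused


# Hypothetical 3-dimensional posteriors for the input sentence and the exemplar.
rng = random.Random(0)
z_sem = reparameterize([0.2, -0.1, 0.5], [0.0, 0.0, 0.0], rng)
z_syn = reparameterize([-0.3, 0.4, 0.1], [0.0, 0.0, 0.0], rng)
fused = gated_fusion(z_sem, z_syn, [1.0] * 3, [1.0] * 3, [0.0] * 3)
print(fused)
```

In a multi-task setup of the kind the abstract describes, a fused vector like this would typically condition the decoder, while the KL term and task-specific losses are summed into one training objective; how SCPG weights those terms is not stated here.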

Source journal
Computer Speech and Language (Engineering & Technology, Computer Science: Artificial Intelligence)
CiteScore: 11.30
Self-citation rate: 4.70%
Articles published: 80
Review time: 22.9 weeks
Journal description: Computer Speech & Language publishes reports of original research related to the recognition, understanding, production, coding and mining of speech and language. The speech and language sciences have a long history, but it is only relatively recently that large-scale implementation of and experimentation with complex models of speech and language processing has become feasible. Such research is often carried out somewhat separately by practitioners of artificial intelligence, computer science, electronic engineering, information retrieval, linguistics, phonetics, or psychology.