Self-supervised Text Style Transfer using Cycle-Consistent Adversarial Networks

IF 4.7 | Zone 2, Chemistry | Q2 MATERIALS SCIENCE, MULTIDISCIPLINARY

ACS Applied Polymer Materials Pub Date : 2024-07-18 DOI:10.1145/3678179

Moreno La Quatra, Giuseppe Gallipoli, Luca Cagliero


Citations: 0

Abstract


Self-supervised Text Style Transfer using Cycle-Consistent Adversarial Networks

Text Style Transfer (TST) is a relevant branch of natural language processing that aims to control the style attributes of a piece of text while preserving its original content. To address TST in the absence of parallel data, Cycle-consistent Generative Adversarial Networks (CycleGANs) have recently emerged as promising solutions. Existing CycleGAN-based TST approaches suffer from the following limitations: (1) They apply self-supervision, based on the cycle-consistency principle, in the latent space. This approach turns out to be less robust to mixed-style inputs, i.e., when the source text is partly in the original and partly in the target style; (2) Generators and discriminators rely on recurrent networks, which are exposed to known issues with long-term text dependencies; (3) The target style is weakly enforced, as the discriminator distinguishes real from fake sentences without explicitly accounting for the generated text’s style. We propose a new CycleGAN-based TST approach that applies self-supervision directly at the sequence level to effectively handle mixed-style inputs and employs Transformers to leverage the attention mechanism for both text encoding and decoding. We also employ a pre-trained style classifier to guide the generation of text in the target style while maintaining the original content’s meaning. The experimental results achieved on the formality and sentiment transfer tasks show that our approach outperforms existing ones, both CycleGAN-based and not (including an open-source Large Language Model), on benchmark data and shows better robustness to mixed-style inputs.
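The abstract's two main ingredients, cycle-consistency applied at the sequence level and a style classifier that guides generation, can be illustrated with a deliberately tiny sketch. Everything here is an assumption made for illustration and not the authors' implementation: the one-token lexicons stand in for the learned Transformer generators, `formal_probability` stands in for the pre-trained style classifier, the weights `lambda_cyc` and `lambda_cls` are hypothetical, and the adversarial (discriminator) term is omitted entirely.

```python
import math

# Hypothetical one-token lexicon; a real system would use learned Transformer generators.
INFORMAL_TO_FORMAL = {"u": "you", "gotta": "must", "kids": "children", "thx": "thanks"}
FORMAL_TO_INFORMAL = {formal: informal for informal, formal in INFORMAL_TO_FORMAL.items()}


def generate(tokens, lexicon):
    """Toy generator: rewrite each token through the lexicon, keep unknown tokens."""
    return [lexicon.get(tok, tok) for tok in tokens]


def cycle_loss(source, reconstructed):
    """Sequence-level cycle term: fraction of positions where the round trip differs."""
    assert len(source) == len(reconstructed)
    mismatches = sum(a != b for a, b in zip(source, reconstructed))
    return mismatches / len(source)


def formal_probability(tokens):
    """Toy style classifier: Laplace-smoothed share of formal marker tokens."""
    formal = sum(tok in FORMAL_TO_INFORMAL for tok in tokens)
    informal = sum(tok in INFORMAL_TO_FORMAL for tok in tokens)
    return (formal + 1) / (formal + informal + 2)


def generator_loss(source, lambda_cyc=1.0, lambda_cls=1.0):
    """Weighted sum of the cycle term and the classifier-guided style term."""
    transferred = generate(source, INFORMAL_TO_FORMAL)         # informal -> formal
    reconstructed = generate(transferred, FORMAL_TO_INFORMAL)  # formal -> informal
    style_term = -math.log(formal_probability(transferred))    # push output toward target style
    return lambda_cyc * cycle_loss(source, reconstructed) + lambda_cls * style_term


pure = "u gotta see the kids".split()
mixed = "you gotta see the children".split()  # partly already in the target style

print(round(generator_loss(pure), 4))
print(round(generator_loss(mixed), 4))
```

In this toy, the mixed-style input incurs a non-zero round-trip term because the reverse lexicon also rewrites tokens that were already formal in the source. That behaviour is a property of the toy lexicon only and says nothing about the results reported in the paper.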


Source journal

ACS Applied Polymer Materials Multiple-

CiteScore: 7.20

Self-citation rate: 6.00%

Articles published: 810

About the journal: ACS Applied Polymer Materials is an interdisciplinary journal publishing original research covering all aspects of engineering, chemistry, physics, and biology relevant to applications of polymers. The journal is devoted to reports of new and original experimental and theoretical research of an applied nature that integrates fundamental knowledge in the areas of materials, engineering, physics, bioscience, polymer science, and chemistry into important polymer applications. The journal is specifically interested in work that addresses relationships among structure, processing, morphology, chemistry, properties, and function, as well as work that provides insights into mechanisms critical to the performance of the polymer for applications.