Self-supervised Text Style Transfer using Cycle-Consistent Adversarial Networks
Moreno La Quatra, Giuseppe Gallipoli, Luca Cagliero
ACM Transactions on Intelligent Systems and Technology (Impact Factor 7.2, JCR Q1: Computer Science, Artificial Intelligence)
Published: 2024-07-18
DOI: https://doi.org/10.1145/3678179
Citations: 0
Abstract
Text Style Transfer (TST) is a relevant branch of natural language processing that aims to control the style attributes of a piece of text while preserving its original content. To address TST in the absence of parallel data, Cycle-consistent Generative Adversarial Networks (CycleGANs) have recently emerged as promising solutions. Existing CycleGAN-based TST approaches suffer from the following limitations: (1) They apply self-supervision, based on the cycle-consistency principle, in the latent space. This approach turns out to be less robust to mixed-style inputs, i.e., when the source text is partly in the original and partly in the target style; (2) Generators and discriminators rely on recurrent networks, which are exposed to known issues with long-term text dependencies; (3) The target style is weakly enforced, as the discriminator distinguishes real from fake sentences without explicitly accounting for the generated text’s style. We propose a new CycleGAN-based TST approach that applies self-supervision directly at the sequence level to effectively handle mixed-style inputs and employs Transformers to leverage the attention mechanism for both text encoding and decoding. We also employ a pre-trained style classifier to guide the generation of text in the target style while maintaining the original content’s meaning. The experimental results achieved on the formality and sentiment transfer tasks show that our approach outperforms existing ones, both CycleGAN-based and not (including an open-source Large Language Model), on benchmark data and shows better robustness to mixed-style inputs.
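The three ingredients named in the abstract (sequence-level cycle consistency, an adversarial signal, and guidance from a pre-trained style classifier) can be illustrated with a toy generator objective. This is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not give the loss formulas, so the negative log-likelihood forms, the function names, and the weights `w_cycle`, `w_style`, and `w_adv` are all hypothetical, and plain probability lists stand in for Transformer outputs.

```python
import math
from typing import Sequence

EPS = 1e-12  # guards against log(0); an arbitrary choice for this sketch


def sequence_cycle_loss(recon_probs: Sequence[Sequence[float]],
                        source_ids: Sequence[int]) -> float:
    """Mean negative log-likelihood of the source tokens under the
    reconstruction (source -> target style -> back to source style).
    Comparing token sequences, rather than latent vectors, is what
    "sequence-level" self-supervision is taken to mean here (assumption)."""
    if len(recon_probs) != len(source_ids):
        raise ValueError("one probability vector per source token is required")
    nll = [-math.log(max(probs[tok], EPS))
           for probs, tok in zip(recon_probs, source_ids)]
    return sum(nll) / len(nll)


def style_classifier_loss(p_target_style: float) -> float:
    """Penalty when a (hypothetical) pre-trained style classifier assigns
    low probability to the target style for the generated text."""
    return -math.log(max(p_target_style, EPS))


def adversarial_loss(p_real: float) -> float:
    """Generator-side penalty when the discriminator judges the
    generated text to be fake (p_real close to 0)."""
    return -math.log(max(p_real, EPS))


def generator_objective(recon_probs: Sequence[Sequence[float]],
                        source_ids: Sequence[int],
                        p_target_style: float,
                        p_real: float,
                        w_cycle: float = 1.0,
                        w_style: float = 1.0,
                        w_adv: float = 1.0) -> float:
    """Weighted sum of the three terms; the weights are illustrative only."""
    return (w_cycle * sequence_cycle_loss(recon_probs, source_ids)
            + w_style * style_classifier_loss(p_target_style)
            + w_adv * adversarial_loss(p_real))


# Toy usage: a two-token source over a two-word vocabulary.
source = [0, 1]
good_recon = [[0.9, 0.1], [0.2, 0.8]]   # mostly recovers the source tokens
bad_recon = [[0.5, 0.5], [0.5, 0.5]]    # uninformative reconstruction
good = generator_objective(good_recon, source, p_target_style=0.9, p_real=0.7)
bad = generator_objective(bad_recon, source, p_target_style=0.9, p_real=0.7)
print(good < bad)
```

In this sketch a better reconstruction of the source lowers the objective while the style and adversarial terms stay fixed. That is the intuition behind pairing cycle consistency with explicit style guidance: one term preserves content while the others push toward the target style.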
About the journal:
ACM Transactions on Intelligent Systems and Technology is a scholarly journal that publishes the highest quality papers on intelligent systems, applicable algorithms and technology with a multi-disciplinary perspective. An intelligent system is one that uses artificial intelligence (AI) techniques to offer important services (e.g., as a component of a larger system) to allow integrated systems to perceive, reason, learn, and act intelligently in the real world.
ACM TIST is published bimonthly (six issues a year). Each issue has 8-11 regular papers, with around 20 published journal pages or 10,000 words per paper. Additional references, proofs, graphs, or detailed experimental results can be submitted as a separate appendix, while excessively lengthy papers will be rejected automatically. Authors can include online-only appendices for additional content of their published papers and are encouraged to share their code and/or data with other readers.