Unsupervised Text Style Transfer for Authorship Obfuscation in Bahasa Indonesia

Yunita Sari, Fadhlan Pasyah Al Faridzi
Journal: IJCCS Indonesian Journal of Computing and Cybernetics Systems
Publication date: 2023-02-23
DOI: 10.22146/ijccs.79623 (https://doi.org/10.22146/ijccs.79623)

Abstract

Authorship attribution is an NLP task that identifies the author of a text through stylometric analysis. Authorship obfuscation, by contrast, aims to protect against authorship attribution by modifying a text’s style. The main challenge in authorship obfuscation is preserving the content of the text while modifying it. In this research, we apply text style transfer methods to modify the writing style while preserving the content of the input text. We implemented two unsupervised text style transfer methods, dictionary-based and back-translation, to change the formality level of the text. Experimental results show that the back-translation method outperformed the dictionary-based method. With back-translation, authorship attribution performance decreased by up to 16.15% and 23.66% in F1-score for 3 and 10 authors, respectively. With the dictionary-based method, the F1-score dropped by up to 1.99% and 11.56% for 3 and 10 authors, respectively. Evaluation of sensibleness and soundness shows that the back-translation method can preserve the semantics of the obfuscated texts. Moreover, the modified texts are well-formed and inconspicuous.
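
To make the dictionary-based idea concrete, the following is a minimal sketch in Python of lexicon-driven formality transfer. It is not the authors' implementation: the lexicon entries, the function names `formalize` and `match_case`, and the tokenization rule are all assumptions chosen for illustration, and the abstract does not describe the actual dictionary or preprocessing used.

```python
import re

# Hypothetical informal-to-formal lexicon, for illustration only.
# The dictionary used in the paper is not given in the abstract.
INFORMAL_TO_FORMAL = {
    "gak": "tidak",
    "udah": "sudah",
    "aja": "saja",
    "banget": "sekali",
}

# Split text into alternating runs of word and non-word characters,
# so that joining the pieces reproduces spacing and punctuation exactly.
TOKEN = re.compile(r"\w+|\W+")


def match_case(source, target):
    """Copy simple capitalization from the source word to its replacement."""
    if source.isupper():
        return target.upper()
    if source[:1].isupper():
        return target.capitalize()
    return target


def formalize(text, lexicon=INFORMAL_TO_FORMAL):
    """Replace every word found in the lexicon; leave all other text unchanged."""
    pieces = []
    for token in TOKEN.findall(text):
        replacement = lexicon.get(token.lower())
        pieces.append(match_case(token, replacement) if replacement else token)
    return "".join(pieces)


print(formalize("Saya udah makan, gak lapar."))
```

A plain word-for-word lookup like this ignores context, so it can only change words that appear in the lexicon. That limitation is consistent with the smaller F1-score drop reported above for the dictionary-based method, although the abstract itself does not state the cause.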