Title: Unicode Evil: Evading NLP Systems Using Visual Similarities of Text Characters
Authors: A. Dionysiou, E. Athanasopoulos
DOI: 10.1145/3474369.3486871
Published in: Proceedings of the 14th ACM Workshop on Artificial Intelligence and Security
Publication date: 2021-11-15
Citations: 5
Abstract
Adversarial Text Generation Frameworks (ATGFs) aim to cause a Natural Language Processing (NLP) machine to misbehave, i.e., to misclassify a given input. In this paper, we propose EvilText, a general ATGF that successfully evades some of the most popular NLP machines by efficiently perturbing a given legitimate text while preserving the original text's semantics and human readability. Perturbations are based on classes of visually similar characters in the Unicode set. EvilText can be used by NLP service operators to evaluate the security and robustness of their systems. Furthermore, EvilText outperforms state-of-the-art ATGFs in terms of: (a) effectiveness, (b) efficiency, and (c) preservation of the original text's semantics and human readability. We evaluate EvilText on some of the most popular NLP systems used for sentiment analysis and toxic content detection. We further expand on the generality and transferability of our ATGF, while also exploring possible countermeasures for defending against our attacks. Surprisingly, naive defence mechanisms fail to mitigate our attacks; the only promising one is restricting the use of Unicode characters. However, we argue that restricting the use of Unicode characters imposes a significant trade-off between security and usability, as almost all websites rely heavily on Unicode support.
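The core perturbation idea described in the abstract, replacing characters with visually near-identical Unicode code points so that the text looks unchanged to a human but differs at the byte level for an NLP model, can be sketched minimally in Python. The homoglyph map below is an illustrative assumption (a few well-known Latin-to-Cyrillic confusables), not the authors' actual EvilText character classes or selection strategy:

```python
# Illustrative homoglyph substitution (not the EvilText implementation):
# each mapped Latin letter is swapped for a visually similar Cyrillic
# character, so the rendered text reads the same while its underlying
# code points differ, which can confuse tokenizers and classifiers.
HOMOGLYPHS = {
    "a": "\u0430",  # CYRILLIC SMALL LETTER A
    "e": "\u0435",  # CYRILLIC SMALL LETTER IE
    "o": "\u043e",  # CYRILLIC SMALL LETTER O
    "c": "\u0441",  # CYRILLIC SMALL LETTER ES
    "p": "\u0440",  # CYRILLIC SMALL LETTER ER
}

def perturb(text: str) -> str:
    """Replace every mapped character with its visual look-alike."""
    return "".join(HOMOGLYPHS.get(ch, ch) for ch in text)

original = "nice place"
evaded = perturb(original)
print(evaded)               # renders almost identically to "nice place"
print(original == evaded)   # False: the strings differ at the code-point level
```

A real attack framework would perturb selectively (e.g., only the characters needed to flip a target model's prediction) rather than rewriting every mapped character, but this sketch shows why the semantics and readability of the original text are preserved: the substitution changes encoding, not appearance.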