Man Hu , Yatao Yang , Deng Pan , Zhongliang Guo , Luwei Xiao , Deyu Lin , Shuai Zhao
{"title":"Syntactic paraphrase-based synthetic data generation for backdoor attacks against Chinese language models","authors":"Man Hu , Yatao Yang , Deng Pan , Zhongliang Guo , Luwei Xiao , Deyu Lin , Shuai Zhao","doi":"10.1016/j.inffus.2025.103376","DOIUrl":null,"url":null,"abstract":"<div><div>Language Models (LMs) have shown significant advancements in various Natural Language Processing (NLP) tasks. However, recent studies indicate that LMs are particularly susceptible to malicious backdoor attacks, where attackers manipulate the models to exhibit specific behaviors when they encounter particular triggers. While existing research has focused on backdoor attacks against English LMs, Chinese LMs remain largely unexplored. Moreover, existing backdoor attacks against Chinese LMs exhibit limited stealthiness. In this paper, we investigate the high detectability of current backdoor attacks against Chinese LMs and propose a more stealthy backdoor attack method based on syntactic paraphrasing. Specifically, we leverage large language models (LLMs) to construct a syntactic paraphrasing mechanism that transforms benign inputs into poisoned samples with predefined syntactic structures. Subsequently, we exploit the syntactic structures of these poisoned samples as triggers to create more stealthy and robust backdoor attacks across various attack strategies. Extensive experiments conducted on three major NLP tasks with various Chinese PLMs and LLMs demonstrate that our method can achieve comparable attack performance (almost 100% success rate). Additionally, the poisoned samples generated by our method show lower perplexity and fewer grammatical errors compared to traditional character-level backdoor attacks. Furthermore, our method exhibits strong resistance against two state-of-the-art backdoor defense mechanisms.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"124 ","pages":"Article 103376"},"PeriodicalIF":14.7000,"publicationDate":"2025-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Information Fusion","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S156625352500449X","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0
Abstract
Language Models (LMs) have shown significant advancements in various Natural Language Processing (NLP) tasks. However, recent studies indicate that LMs are particularly susceptible to malicious backdoor attacks, where attackers manipulate the models to exhibit specific behaviors when they encounter particular triggers. While existing research has focused on backdoor attacks against English LMs, Chinese LMs remain largely unexplored. Moreover, existing backdoor attacks against Chinese LMs exhibit limited stealthiness. In this paper, we investigate the high detectability of current backdoor attacks against Chinese LMs and propose a more stealthy backdoor attack method based on syntactic paraphrasing. Specifically, we leverage large language models (LLMs) to construct a syntactic paraphrasing mechanism that transforms benign inputs into poisoned samples with predefined syntactic structures. Subsequently, we exploit the syntactic structures of these poisoned samples as triggers to create more stealthy and robust backdoor attacks across various attack strategies. Extensive experiments conducted on three major NLP tasks with various Chinese PLMs and LLMs demonstrate that our method can achieve comparable attack performance (almost 100% success rate). Additionally, the poisoned samples generated by our method show lower perplexity and fewer grammatical errors compared to traditional character-level backdoor attacks. Furthermore, our method exhibits strong resistance against two state-of-the-art backdoor defense mechanisms.
期刊介绍:
Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, focusing on architectures, algorithms, and applications. Papers dealing with fundamental theoretical analyses as well as those demonstrating their application to real-world problems will be welcome.