Title: A hypothetical defenses-based training framework for generating transferable adversarial examples
Authors: Lingguang Hao, Kuangrong Hao, Yaochu Jin, Hongzhi Zhao
DOI: 10.1016/j.knosys.2024.112602
Journal: Knowledge-Based Systems (JCR Q1, Computer Science, Artificial Intelligence; Impact Factor 7.2)
Publication date: 2024-10-08 (Journal Article)
Full text: https://www.sciencedirect.com/science/article/pii/S095070512401236X
Code: https://github.com/haolingguang/TA-HD
Citations: 0
Abstract
Transfer-based attacks use a proxy model to craft adversarial examples against a target model, and have driven significant advances in black-box attacks. Recent research suggests that these attacks can be strengthened by incorporating adversarial defenses into the training process of adversarial examples. Specifically, the defenses supervise the training process, forcing the attacker to overcome greater challenges and produce more robust adversarial examples with enhanced transferability. However, current methods rely mainly on limited input-transformation defenses, which apply only linear affine transformations. These defenses are insufficient for removing harmful content from adversarial examples, resulting in restricted improvements in transferability. To address this issue, we propose a novel training framework named Transfer-based Attacks through Hypothesis Defense (TA-HD). This framework enhances the generalization of adversarial examples by integrating a hypothesis defense mechanism into the proxy model. Specifically, we propose an input denoising network as the hypothesis defense to effectively remove harmful noise from adversarial examples. Furthermore, we introduce an adversarial training strategy and design specific adversarial loss functions to optimize the input denoising network's parameters. Visualization of the training process demonstrates the effective denoising capability of the hypothesis defense mechanism and the stability of training. Extensive experiments show that the proposed framework improves the success rate of transfer-based attacks by up to 19.9%. The code is available at https://github.com/haolingguang/TA-HD.
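The core idea, crafting the adversarial example *through* the defense so the perturbation must survive denoising, can be illustrated with a toy sketch. This is not the paper's implementation: the 1-D linear "proxy model", the shrink-toward-clean `denoise` stand-in, and the finite-difference FGSM step are all illustrative assumptions, not TA-HD's actual networks or losses.

```python
# Toy sketch of attacking through a defended pipeline: the FGSM step is
# computed on loss(proxy(denoise(x))), so the perturbation must remain
# effective after the (hypothetical) denoising defense is applied.
import math

def proxy_logit(x, w=2.0, b=-1.0):
    """Hypothetical 1-D linear proxy model: logit = w*x + b."""
    return w * x + b

def denoise(x, x_clean, strength=0.5):
    """Stand-in 'hypothesis defense': shrink the input back toward a clean
    reference, attenuating any adversarial perturbation by `strength`."""
    return x_clean + (1.0 - strength) * (x - x_clean)

def loss(logit, label):
    """Binary cross-entropy on a sigmoid output."""
    p = 1.0 / (1.0 + math.exp(-logit))
    eps = 1e-12
    return -(label * math.log(p + eps) + (1 - label) * math.log(1 - p + eps))

def fgsm_through_defense(x_clean, label, eps=0.3, h=1e-5):
    """One FGSM step on the defended pipeline, using a numerical gradient
    (finite differences) with respect to the input."""
    def defended_loss(x):
        return loss(proxy_logit(denoise(x, x_clean)), label)
    g = (defended_loss(x_clean + h) - defended_loss(x_clean - h)) / (2.0 * h)
    return x_clean + eps * (1.0 if g > 0 else -1.0)

x_adv = fgsm_through_defense(x_clean=1.0, label=1)
```

In TA-HD proper, the denoiser is itself a trained network whose parameters are optimized adversarially against the attacker (a min-max game), rather than the fixed shrinkage used here for brevity.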
Journal description:
Knowledge-Based Systems, an international and interdisciplinary journal in artificial intelligence, publishes original, innovative, and creative research results in the field. It focuses on systems built with knowledge-based and other artificial-intelligence techniques. The journal aims to support human prediction and decision-making through data science and computational techniques, to provide balanced coverage of theory and practical study, and to encourage the development and implementation of knowledge-based intelligence models, methods, systems, and software tools. Applications in business, government, education, engineering, and healthcare are emphasized.