{"title":"面向真实暴露的对抗示例生成的改进两阶段生成对抗网络","authors":"Priyanka Goyal, D. Singh","doi":"10.2174/2666255816666230608104148","DOIUrl":null,"url":null,"abstract":"\n\nDeep neural networks due to their linear nature are sensitive to adversarial examples. They can easily be broken just by a small disturbance to the input data. Some of the existing methods to perform these kinds of attacks are pixel-level perturbation and spatial transformation of images.\n\n\n\nThese methods generate adversarial examples that can be fed to the network for wrong predictions. The drawback that comes with these methods is that they are really slow and computationally expensive. This research work performed a black box attack on the target model classifier by using the generative adversarial networks (GAN) to generate adversarial examples that can fool a classifier model to classify the images as wrong classes. The proposed method used a biased dataset that does not contain any data of the target label to train the first generator Gnorm of the first stage GAN, and after the first training has finished, the second stage generator Gadv, which is a new generator model that does not take random noise as input but the output of the first generator Gnorm.\n\n\n\nThe generated examples have been superimposed with the Gnorm output with a small constant, and then the superimposed data have been fed to the target model classifier to calculate the loss. Some additional losses have been included to constrain the generation from generating target examples.\n\n\n\nThe proposed model has shown a better fidelity score, as evaluated using Fretchet inception distance score (FID), which was up to 42.43 in the first stage and up to 105.65 in the second stage with the attack success rate of up to 99.13%.\n","PeriodicalId":36514,"journal":{"name":"Recent Advances in Computer Science and Communications","volume":" ","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2023-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Improved Two Stage Generative Adversarial Networks for Adversarial Example Generation with Real Exposure\",\"authors\":\"Priyanka Goyal, D. Singh\",\"doi\":\"10.2174/2666255816666230608104148\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"\\n\\nDeep neural networks due to their linear nature are sensitive to adversarial examples. They can easily be broken just by a small disturbance to the input data. Some of the existing methods to perform these kinds of attacks are pixel-level perturbation and spatial transformation of images.\\n\\n\\n\\nThese methods generate adversarial examples that can be fed to the network for wrong predictions. The drawback that comes with these methods is that they are really slow and computationally expensive. This research work performed a black box attack on the target model classifier by using the generative adversarial networks (GAN) to generate adversarial examples that can fool a classifier model to classify the images as wrong classes. 
The proposed method used a biased dataset that does not contain any data of the target label to train the first generator Gnorm of the first stage GAN, and after the first training has finished, the second stage generator Gadv, which is a new generator model that does not take random noise as input but the output of the first generator Gnorm.\\n\\n\\n\\nThe generated examples have been superimposed with the Gnorm output with a small constant, and then the superimposed data have been fed to the target model classifier to calculate the loss. Some additional losses have been included to constrain the generation from generating target examples.\\n\\n\\n\\nThe proposed model has shown a better fidelity score, as evaluated using Fretchet inception distance score (FID), which was up to 42.43 in the first stage and up to 105.65 in the second stage with the attack success rate of up to 99.13%.\\n\",\"PeriodicalId\":36514,\"journal\":{\"name\":\"Recent Advances in Computer Science and Communications\",\"volume\":\" \",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-06-08\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Recent Advances in Computer Science and Communications\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.2174/2666255816666230608104148\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"Computer Science\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Recent Advances in Computer Science and Communications","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.2174/2666255816666230608104148","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"Computer Science","Score":null,"Total":0}
Improved Two Stage Generative Adversarial Networks for Adversarial Example Generation with Real Exposure
Deep neural networks, owing in part to their largely linear behavior, are sensitive to adversarial examples: they can be broken by a small, carefully crafted disturbance to the input data. Existing methods for mounting such attacks include pixel-level perturbation and spatial transformation of images.
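For context, a pixel-level perturbation attack can be as simple as the fast gradient sign method (FGSM). The sketch below is illustrative background rather than part of this paper's method; the PyTorch classifier `model`, the labels `y`, and the step size `epsilon` are assumptions.

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, epsilon=0.03):
    # One-step pixel-level perturbation: nudge every pixel by epsilon
    # in the direction that increases the classification loss.
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    x_adv = x + epsilon * x.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()  # keep pixels in the valid range
```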
These methods generate adversarial examples that, when fed to the network, induce wrong predictions. Their drawback is that they are slow and computationally expensive. This work mounts a black-box attack on a target classifier by using generative adversarial networks (GANs) to generate adversarial examples that fool the classifier into assigning images to wrong classes. The proposed method uses a biased dataset, containing no data of the target label, to train the first-stage generator Gnorm; once that training has finished, the second-stage generator Gadv is trained. Gadv is a new generator model that takes as input not random noise but the output of the first generator Gnorm.
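A minimal sketch of how the two stages could be wired, assuming DCGAN-style convolutional generators for 32x32 RGB images; the layer sizes and the class definitions below are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class Gnorm(nn.Module):
    # Stage 1: noise -> image, trained as an ordinary GAN on a biased
    # dataset that contains no samples of the target label.
    def __init__(self, z_dim=100):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(z_dim, 256, 4, 1, 0), nn.BatchNorm2d(256), nn.ReLU(True),  # 1x1 -> 4x4
            nn.ConvTranspose2d(256, 128, 4, 2, 1), nn.BatchNorm2d(128), nn.ReLU(True),    # 4 -> 8
            nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.BatchNorm2d(64), nn.ReLU(True),      # 8 -> 16
            nn.ConvTranspose2d(64, 3, 4, 2, 1), nn.Tanh(),                                # 16 -> 32
        )

    def forward(self, z):          # z: (batch, z_dim, 1, 1)
        return self.net(z)

class Gadv(nn.Module):
    # Stage 2: consumes Gnorm's output instead of random noise and
    # emits a same-sized perturbation map in [-1, 1].
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 3, 1, 1), nn.ReLU(True),
            nn.Conv2d(64, 64, 3, 1, 1), nn.ReLU(True),
            nn.Conv2d(64, 3, 3, 1, 1), nn.Tanh(),
        )

    def forward(self, x_norm):     # x_norm: (batch, 3, 32, 32)
        return self.net(x_norm)
```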
The generated examples are superimposed on the Gnorm output, scaled by a small constant, and the superimposed data are fed to the target classifier to compute the loss. Additional loss terms are included to constrain the generator from producing examples of the target label.
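One way to read the superimposition step, under stated assumptions: `eps` stands in for the small constant, the attack loss is an untargeted negative cross-entropy against the target model's own prediction, and `target_model` is the classifier under attack. The abstract does not spell out the exact loss terms, so this is a sketch, not the paper's objective.

```python
import torch
import torch.nn.functional as F

def attack_step(g_norm, g_adv, target_model, z, eps=0.1):
    with torch.no_grad():
        x_norm = g_norm(z)                             # stage-1 "normal" image
    perturb = g_adv(x_norm)                            # stage-2 perturbation
    x_adv = (x_norm + eps * perturb).clamp(-1.0, 1.0)  # superimpose with a small constant
    logits = target_model(x_adv)
    y_pred = logits.argmax(dim=1)
    # Untargeted attack: push the classifier away from its current
    # prediction. Note that backpropagating through target_model needs
    # gradient access; a strictly black-box setting would substitute a
    # surrogate model here.
    attack_loss = -F.cross_entropy(logits, y_pred)
    return attack_loss, x_adv
```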
The proposed model shows a better fidelity score, as evaluated with the Fréchet Inception Distance (FID): up to 42.43 in the first stage and up to 105.65 in the second stage, with an attack success rate of up to 99.13%.
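For reference, FID compares Gaussian fits to Inception-v3 activation statistics of real and generated images; lower values indicate distributions that are closer, i.e., higher fidelity:

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,\left(\Sigma_r \Sigma_g\right)^{1/2} \right)
```

where (μ_r, Σ_r) and (μ_g, Σ_g) are the mean and covariance of the activations for real and generated images, respectively.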