Kenta Tsunomori, Yuma Yamasaki, M. Kuribayashi, N. Funabiki, I. Echizen
{"title":"Detection and Correction of Adversarial Examples Based on IPEG-Compression-Derived Distortion","authors":"Kenta Tsunomori, Yuma Yamasaki, M. Kuribayashi, N. Funabiki, I. Echizen","doi":"10.23919/APSIPAASC55919.2022.9980147","DOIUrl":null,"url":null,"abstract":"An effective way to defend against adversarial examples (AEs), which are used, for example, to attack applications such as face recognition, is to detect in advance whether an input image is an AE. Some AE defense methods focus on the response characteristics of image classifiers when denoising filters are applied to the input image. However, several filters are required, which results in a large amount of computation. Because JPEG compression of AEs effectively removes adversarial perturbations, the difference between the image before and after JPEG compression should be highly correlated with the perturbations. However, the difference should not be completely consistent with adversarial perturbations. We have developed a filtering operation that modulates this difference by varying their magnitude and positive/negative sign and adding them to an image so that adversarial perturbations can be effectively removed. We consider that adversarial perturbations that could not be removed by simply applying JPEG compression can be removed by modulating this difference. Furthermore, applying a resizing process to the image after adding these distortions enables us to remove perturbations that could not be removed otherwise. The filtering operation will successfully remove the adversarial noise and reconstruct the corrected samples from AEs. We also consider a simple but effective reconstruction method based on the filtering operations. Experiments in which the adversarial attack used was not known to the detector demonstrated that the proposed method could achieve better performance in terms of accuracy with reasonable computational complexity. In addition, the percentage of correct classification results after applying the proposed filter for non-targeted attacks was higher than that of JPEG compression and scaling. These results suggest that the proposed method effectively removes adversarial perturbations and is an effective filter for detecting AEs.","PeriodicalId":382967,"journal":{"name":"2022 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)","volume":"43 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.23919/APSIPAASC55919.2022.9980147","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
An effective way to defend against adversarial examples (AEs), which are used, for example, to attack applications such as face recognition, is to detect in advance whether an input image is an AE. Some AE defense methods focus on the response characteristics of image classifiers when denoising filters are applied to the input image. However, several filters are required, which results in a large amount of computation. Because JPEG compression of AEs effectively removes adversarial perturbations, the difference between the image before and after JPEG compression should be highly correlated with the perturbations. However, the difference should not be completely consistent with adversarial perturbations. We have developed a filtering operation that modulates this difference by varying their magnitude and positive/negative sign and adding them to an image so that adversarial perturbations can be effectively removed. We consider that adversarial perturbations that could not be removed by simply applying JPEG compression can be removed by modulating this difference. Furthermore, applying a resizing process to the image after adding these distortions enables us to remove perturbations that could not be removed otherwise. The filtering operation will successfully remove the adversarial noise and reconstruct the corrected samples from AEs. We also consider a simple but effective reconstruction method based on the filtering operations. Experiments in which the adversarial attack used was not known to the detector demonstrated that the proposed method could achieve better performance in terms of accuracy with reasonable computational complexity. In addition, the percentage of correct classification results after applying the proposed filter for non-targeted attacks was higher than that of JPEG compression and scaling. These results suggest that the proposed method effectively removes adversarial perturbations and is an effective filter for detecting AEs.