DeepWise Cyber Teen Guardian: Protecting Internet Environment via a Novel Automatic Adversarial Samples Detection System Based on Revised Neural Network
{"title":"DeepWise Cyber Teen Guardian: Protecting Internet Environment via a Novel Automatic Adversarial Samples Detection System Based on Revised Neural Network","authors":"Yi-Hsien Lin","doi":"10.1109/EPCE58798.2023.00022","DOIUrl":null,"url":null,"abstract":"In order to assist teenagers to surf the internet healthily, there are image filters based on image recognition neural networks removing inappropriate content, but image filters are vulnerable to attack by adversarial samples, which are generated by adding well-crafted noise to an image, making the filter consider an inappropriate image as appropriate. In order to defend against adversarial attacks, various adversarial detection algorithms are designed. The state-of-the-art detector enjoys a promising detection performance, but it suffers from high computational overhead. In this work, DeepWise is proposed, which records the mean and covariance for each class of images at each recording layer of the neural network during training. The input image’s mean and covariance at each layer and the network’s classification of the image are also recorded. Then, the Mahalanobis distance is calculated between the distributions of input image and the training images at each layer based on the network’s prediction. A linear regressor takes these distances as input, and determines whether the input image is an adversarial sample or not. DeepWise achieves a computational saving ranging from 39.77% to 87.98% for datasets SVHN, CIFAR- 10, and CIFAR-100 on the image recognition model ResNet-34 and DenseNet-3 while preserving a comparable AUROC to the existing state-of-the-art method.","PeriodicalId":355442,"journal":{"name":"2023 2nd Asia Conference on Electrical, Power and Computer Engineering (EPCE)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2023 2nd Asia Conference on Electrical, Power and Computer Engineering (EPCE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/EPCE58798.2023.00022","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
To help teenagers browse the internet safely, image filters built on image-recognition neural networks are deployed to remove inappropriate content. These filters, however, are vulnerable to adversarial samples: images with carefully crafted noise added so that the filter classifies an inappropriate image as appropriate. Various adversarial detection algorithms have been designed to defend against such attacks; the state-of-the-art detector delivers promising detection performance but suffers from high computational overhead. This work proposes DeepWise, which records the mean and covariance of each class of training images at each designated recording layer of the neural network during training. At test time, the input image's features at each recording layer and the network's classification of the image are also recorded. The Mahalanobis distance between the input image's features and the training-image distribution of the predicted class is then computed at each layer. A linear regressor takes these per-layer distances as input and determines whether the input image is an adversarial sample. DeepWise achieves computational savings ranging from 39.77% to 87.98% on the SVHN, CIFAR-10, and CIFAR-100 datasets with the ResNet-34 and DenseNet-3 image recognition models while preserving an AUROC comparable to the existing state-of-the-art method.
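As a rough illustration of the pipeline the abstract describes, the sketch below (in Python, assuming PyTorch, NumPy, and scikit-learn, none of which the paper necessarily uses) estimates per-class Gaussian statistics from training features, computes the Mahalanobis distance of a test image's layer-wise features to the predicted class, and fits a linear detector over the per-layer distances. All helper names (`pooled_features`, `fit_class_gaussians`, `fit_detector`) and the hook-based feature extraction are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of a Mahalanobis-distance adversarial detector,
# under the assumptions stated above.
import numpy as np
import torch
from sklearn.linear_model import LogisticRegression


def pooled_features(model, layer_names, x):
    """Run x through `model`; return logits and average-pooled features of the named layers."""
    feats, hooks = {}, []

    def make_hook(name):
        def hook(_module, _inp, out):
            # Collapse spatial dimensions so each image maps to one feature vector.
            feats[name] = out.flatten(2).mean(dim=2) if out.dim() == 4 else out
        return hook

    for name, module in model.named_modules():
        if name in layer_names:
            hooks.append(module.register_forward_hook(make_hook(name)))
    with torch.no_grad():
        logits = model(x)
    for h in hooks:
        h.remove()
    # Callers should move these to NumPy (.cpu().numpy()) before the helpers below.
    return logits, [feats[name] for name in layer_names]


def fit_class_gaussians(features, labels, num_classes, eps=1e-3):
    """Per-class mean and inverse covariance for one layer's training features (NumPy arrays)."""
    stats = []
    for c in range(num_classes):
        fc = features[labels == c]
        mu = fc.mean(axis=0)
        cov = np.cov(fc, rowvar=False) + eps * np.eye(fc.shape[1])  # regularize for invertibility
        stats.append((mu, np.linalg.inv(cov)))
    return stats


def mahalanobis_to_predicted_class(feature, stats, pred):
    """Distance of one feature vector to the Gaussian of the predicted class only."""
    mu, cov_inv = stats[pred]
    d = feature - mu
    return float(d @ cov_inv @ d)


def fit_detector(clean_distances, adv_distances):
    """Linear detector over per-layer distance vectors; label 1 means adversarial."""
    X = np.vstack([clean_distances, adv_distances])
    y = np.concatenate([np.zeros(len(clean_distances)), np.ones(len(adv_distances))])
    return LogisticRegression().fit(X, y)
```

One plausible reading of the abstract, reflected in this sketch, is that the computational saving comes from evaluating the distance only for the class the network actually predicts rather than for every class at every layer; the abstract itself does not spell out the mechanism, so this remains an interpretation.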