EG-Booster: Explanation-Guided Booster of ML Evasion Attacks

Proceedings of the Twelfth ACM Conference on Data and Application Security and Privacy. Pub Date: 2021-08-31. DOI: 10.1145/3508398.3511510

Abderrahmen Amich, Birhanu Eshete


Citations: 1

Abstract



The widespread usage of machine learning (ML) in a myriad of domains has raised questions about its trustworthiness in high-stakes environments. Part of the quest for trustworthy ML is assessing robustness to test-time adversarial examples. In line with the trustworthy ML goal, a useful input to potentially aid robustness evaluation is feature-based explanations of model predictions. In this paper, we present a novel approach, called EG-Booster, that leverages techniques from explainable ML to guide adversarial example crafting for improved robustness evaluation of ML models. The key insight in EG-Booster is the use of feature-based explanations of model predictions to guide adversarial example crafting by adding consequential perturbations (likely to result in model evasion) and avoiding non-consequential perturbations (unlikely to contribute to evasion). EG-Booster is agnostic to model architecture and threat model, and it supports diverse distance metrics used in the literature. We evaluate EG-Booster using image classification benchmark datasets: MNIST and CIFAR10. Our findings suggest that EG-Booster significantly improves the evasion rate of state-of-the-art attacks while performing a smaller number of perturbations. Through extensive experiments that cover four white-box and three black-box attacks, we demonstrate the effectiveness of EG-Booster against two undefended neural networks trained on MNIST and CIFAR10, and an adversarially trained ResNet model trained on CIFAR10. Furthermore, we introduce a stability assessment metric and evaluate the reliability of our explanation-based attack boosting approach by tracking the similarity between the model's predictions across multiple runs of EG-Booster. Our results over 10 separate runs suggest that EG-Booster's output is stable across distinct runs. Combined with state-of-the-art attacks, we hope EG-Booster will be used towards improved robustness assessment of ML models against evasion attacks.
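
The idea of keeping consequential perturbations and dropping non-consequential ones can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the explanation method or the selection rule, so everything below is an assumption made for illustration. The toy linear classifier, the zero baseline, the attribution formula `w * (x - baseline)` (an exact additive decomposition of the logit for a linear model), and the helper names `predict`, `attributions`, and `filter_perturbations` are all hypothetical. The rule used here keeps a baseline attack's perturbation only on features whose attribution supports the true label.

```python
import numpy as np

def predict(w, b, x):
    """Toy linear binary classifier (assumption): label 1 if w.x + b > 0, else 0."""
    return int(np.dot(w, x) + b > 0)

def attributions(w, x, baseline):
    """Per-feature contribution to the class-1 logit relative to a baseline.
    For a linear model this decomposition is exact."""
    return w * (x - baseline)

def filter_perturbations(w, x, x_adv, true_label, baseline):
    """Hypothetical selection rule: keep the attack's perturbation only on
    features whose attribution supports the true label; revert the rest."""
    attr = attributions(w, x, baseline)
    # Attributions are w.r.t. the class-1 logit, so flip the sign for label 0.
    supports_true = attr > 0 if true_label == 1 else attr < 0
    delta = x_adv - x
    x_filtered = x + np.where(supports_true, delta, 0.0)
    return x_filtered, supports_true

# Toy model and input (all values are made up for illustration).
w = np.array([2.0, -1.0, 0.5, 0.0])
b = 0.0
x = np.array([1.0, 0.5, 1.0, 1.0])
baseline = np.zeros_like(x)
true_label = predict(w, b, x)  # the clean input is classified as 1

# Perturbation produced by some baseline attack (made up); it touches 4 features.
delta = np.array([-1.2, -0.3, -0.2, 0.4])
x_adv = x + delta

x_boosted, kept = filter_perturbations(w, x, x_adv, true_label, baseline)

print("clean label:     ", true_label)
print("baseline attack: ", predict(w, b, x_adv), "with", np.count_nonzero(delta), "perturbed features")
print("filtered attack: ", predict(w, b, x_boosted), "with", int(kept.sum()), "perturbed features")
```

In this toy case the filtered example still evades the classifier while perturbing two features instead of four, which mirrors the abstract's claim of achieving evasion with fewer perturbations. A real setting would need an explanation method suited to the target model and a check that the result stays within the chosen distance budget.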
