通过向不同的买家分发不同的副本来减轻对抗性攻击

Proceedings of the 2023 ACM Asia Conference on Computer and Communications Security Pub Date : 2021-11-30 DOI:10.1145/3579856.3582824

Jiyi Zhang, Hansheng Fang, W. Tann, Ke Xu, Chengfang Fang, E. Chang

{"title":"通过向不同的买家分发不同的副本来减轻对抗性攻击","authors":"Jiyi Zhang, Hansheng Fang, W. Tann, Ke Xu, Chengfang Fang, E. Chang","doi":"10.1145/3579856.3582824","DOIUrl":null,"url":null,"abstract":"Machine learning models are vulnerable to adversarial attacks. In this paper, we consider the scenario where a model is distributed to multiple buyers, among which a malicious buyer attempts to attack another buyer. The malicious buyer probes its copy of the model to search for adversarial samples and then presents the found samples to the victim’s copy of the model in order to replicate the attack. We point out that by distributing different copies of the model to different buyers, we can mitigate the attack such that adversarial samples found on one copy would not work on another copy. We observed that training a model with different randomness indeed mitigates such replication to a certain degree. However, there is no guarantee and retraining is computationally expensive. A number of works extended the retraining method to enhance the differences among models. However, a very limited number of models can be produced using such methods and the computational cost becomes even higher. Therefore, we propose a flexible parameter rewriting method that directly modifies the model’s parameters. This method does not require additional training and is able to generate a large number of copies in a more controllable manner, where each copy induces different adversarial regions. Experimentation studies show that rewriting can significantly mitigate the attacks while retaining high classification accuracy. For instance, on GTSRB dataset with respect to Hop Skip Jump attack, using attractor-based rewriter can reduce the success rate of replicating the attack to 0.5% while independently training copies with different randomness can reduce the success rate to 6.5%. From this study, we believe that there are many further directions worth exploring.","PeriodicalId":156082,"journal":{"name":"Proceedings of the 2023 ACM Asia Conference on Computer and Communications Security","volume":"17 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"Mitigating Adversarial Attacks by Distributing Different Copies to Different Buyers\",\"authors\":\"Jiyi Zhang, Hansheng Fang, W. Tann, Ke Xu, Chengfang Fang, E. Chang\",\"doi\":\"10.1145/3579856.3582824\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Machine learning models are vulnerable to adversarial attacks. In this paper, we consider the scenario where a model is distributed to multiple buyers, among which a malicious buyer attempts to attack another buyer. The malicious buyer probes its copy of the model to search for adversarial samples and then presents the found samples to the victim’s copy of the model in order to replicate the attack. We point out that by distributing different copies of the model to different buyers, we can mitigate the attack such that adversarial samples found on one copy would not work on another copy. We observed that training a model with different randomness indeed mitigates such replication to a certain degree. However, there is no guarantee and retraining is computationally expensive. A number of works extended the retraining method to enhance the differences among models. However, a very limited number of models can be produced using such methods and the computational cost becomes even higher. Therefore, we propose a flexible parameter rewriting method that directly modifies the model’s parameters. This method does not require additional training and is able to generate a large number of copies in a more controllable manner, where each copy induces different adversarial regions. Experimentation studies show that rewriting can significantly mitigate the attacks while retaining high classification accuracy. For instance, on GTSRB dataset with respect to Hop Skip Jump attack, using attractor-based rewriter can reduce the success rate of replicating the attack to 0.5% while independently training copies with different randomness can reduce the success rate to 6.5%. From this study, we believe that there are many further directions worth exploring.\",\"PeriodicalId\":156082,\"journal\":{\"name\":\"Proceedings of the 2023 ACM Asia Conference on Computer and Communications Security\",\"volume\":\"17 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-11-30\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 2023 ACM Asia Conference on Computer and Communications Security\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3579856.3582824\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2023 ACM Asia Conference on Computer and Communications Security","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3579856.3582824","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 3

摘要

机器学习模型容易受到对抗性攻击。在本文中，我们考虑一个模型被分发给多个买家的场景，其中一个恶意买家试图攻击另一个买家。恶意买家通过探测其模型副本来搜索对抗样本，然后将找到的样本呈现给受害者的模型副本，以复制攻击。我们指出，通过将模型的不同副本分发给不同的买家，我们可以减轻攻击，这样在一个副本上发现的对抗性样本在另一个副本上就不起作用了。我们观察到，训练具有不同随机性的模型确实在一定程度上减轻了这种复制。然而，这并不能保证再培训在计算上是昂贵的。许多工作扩展了再训练方法，以增强模型之间的差异。然而，使用这种方法可以产生的模型数量非常有限，并且计算成本变得更高。因此，我们提出了一种灵活的参数重写方法，直接修改模型的参数。这种方法不需要额外的训练，并且能够以一种更可控的方式生成大量的副本，其中每个副本诱导不同的对抗区域。实验研究表明，在保持较高分类准确率的同时，改写算法可以显著减轻攻击。例如，在GTSRB数据集上，对于Hop Skip Jump攻击，使用基于吸引子的重写器可以将复制攻击的成功率降低到0.5%，而独立训练不同随机性的副本可以将成功率降低到6.5%。从这项研究中，我们认为有许多进一步的方向值得探索。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Mitigating Adversarial Attacks by Distributing Different Copies to Different Buyers

Machine learning models are vulnerable to adversarial attacks. In this paper, we consider the scenario where a model is distributed to multiple buyers, among which a malicious buyer attempts to attack another buyer. The malicious buyer probes its copy of the model to search for adversarial samples and then presents the found samples to the victim’s copy of the model in order to replicate the attack. We point out that by distributing different copies of the model to different buyers, we can mitigate the attack such that adversarial samples found on one copy would not work on another copy. We observed that training a model with different randomness indeed mitigates such replication to a certain degree. However, there is no guarantee and retraining is computationally expensive. A number of works extended the retraining method to enhance the differences among models. However, a very limited number of models can be produced using such methods and the computational cost becomes even higher. Therefore, we propose a flexible parameter rewriting method that directly modifies the model’s parameters. This method does not require additional training and is able to generate a large number of copies in a more controllable manner, where each copy induces different adversarial regions. Experimentation studies show that rewriting can significantly mitigate the attacks while retaining high classification accuracy. For instance, on GTSRB dataset with respect to Hop Skip Jump attack, using attractor-based rewriter can reduce the success rate of replicating the attack to 0.5% while independently training copies with different randomness can reduce the success rate to 6.5%. From this study, we believe that there are many further directions worth exploring.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Proceedings of the 2023 ACM Asia Conference on Computer and Communications Security

自引率

0.00%

发文量