Defending Against Neural Network Model Stealing Attacks Using Deceptive Perturbations

Taesung Lee, Ben Edwards, Ian Molloy, D. Su
{"title":"利用欺骗性摄动防御神经网络模型窃取攻击","authors":"Taesung Lee, Ben Edwards, Ian Molloy, D. Su","doi":"10.1109/SPW.2019.00020","DOIUrl":null,"url":null,"abstract":"Machine learning architectures are readily available, but obtaining the high quality labeled data for training is costly. Pre-trained models available as cloud services can be used to generate this costly labeled data, and would allow an attacker to replicate trained models, effectively stealing them. Limiting the information provided by cloud based models by omitting class probabilities has been proposed as a means of protection but significantly impacts the utility of the models. In this work, we illustrate how cloud based models can still provide useful class probability information for users, while significantly limiting the ability of an adversary to steal the model. Our defense perturbs the model's final activation layer, slightly altering the output probabilities. This forces the adversary to discard the class probabilities, requiring significantly more queries before they can train a model with comparable performance. We evaluate our defense under diverse scenarios and defense aware attacks. Our evaluation shows our defense can degrade the accuracy of the stolen model at least 20%, or increase the number of queries required by an adversary 64 fold, all with a negligible decrease in the protected model accuracy.","PeriodicalId":125351,"journal":{"name":"2019 IEEE Security and Privacy Workshops (SPW)","volume":"128 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"64","resultStr":"{\"title\":\"Defending Against Neural Network Model Stealing Attacks Using Deceptive Perturbations\",\"authors\":\"Taesung Lee, Ben Edwards, Ian Molloy, D. Su\",\"doi\":\"10.1109/SPW.2019.00020\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Machine learning architectures are readily available, but obtaining the high quality labeled data for training is costly. Pre-trained models available as cloud services can be used to generate this costly labeled data, and would allow an attacker to replicate trained models, effectively stealing them. Limiting the information provided by cloud based models by omitting class probabilities has been proposed as a means of protection but significantly impacts the utility of the models. In this work, we illustrate how cloud based models can still provide useful class probability information for users, while significantly limiting the ability of an adversary to steal the model. Our defense perturbs the model's final activation layer, slightly altering the output probabilities. This forces the adversary to discard the class probabilities, requiring significantly more queries before they can train a model with comparable performance. We evaluate our defense under diverse scenarios and defense aware attacks. 
Our evaluation shows our defense can degrade the accuracy of the stolen model at least 20%, or increase the number of queries required by an adversary 64 fold, all with a negligible decrease in the protected model accuracy.\",\"PeriodicalId\":125351,\"journal\":{\"name\":\"2019 IEEE Security and Privacy Workshops (SPW)\",\"volume\":\"128 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-05-19\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"64\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2019 IEEE Security and Privacy Workshops (SPW)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/SPW.2019.00020\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 IEEE Security and Privacy Workshops (SPW)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SPW.2019.00020","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 64

Abstract

Machine learning architectures are readily available, but obtaining the high-quality labeled data needed for training is costly. Pre-trained models offered as cloud services can be used to generate this costly labeled data, allowing an attacker to replicate trained models, effectively stealing them. Limiting the information provided by cloud-based models by omitting class probabilities has been proposed as a means of protection, but it significantly impacts the utility of the models. In this work, we illustrate how cloud-based models can still provide useful class probability information for users while significantly limiting an adversary's ability to steal the model. Our defense perturbs the model's final activation layer, slightly altering the output probabilities. This forces the adversary to discard the class probabilities, requiring significantly more queries before they can train a model with comparable performance. We evaluate our defense under diverse scenarios and defense-aware attacks. Our evaluation shows that our defense can degrade the accuracy of the stolen model by at least 20%, or increase the number of queries required by an adversary 64-fold, all with a negligible decrease in the protected model's accuracy.
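
The defense described in the abstract amounts to a small post-processing step applied to the model's served outputs. The following is a minimal NumPy sketch of the general idea, not the paper's exact perturbation: the reshaping function and the hyperparameters `beta` and `gamma` are illustrative assumptions. The served probabilities are nudged away from the true softmax values while still summing to 1 and, for a small enough `gamma`, keeping the same top-1 prediction.

```python
import numpy as np

def softmax(logits):
    """Standard softmax over the model's final activation layer."""
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

def deceptive_probabilities(logits, beta=0.7, gamma=0.2):
    """Serve slightly perturbed class probabilities instead of the true softmax.

    `beta` and `gamma` are illustrative hyperparameters (not values from the
    paper): `beta` reshapes each probability through a sigmoid of its scaled
    log-odds, and `gamma` controls how far the served output deviates from the
    true distribution. For small `gamma` the top-1 prediction is preserved, so
    ordinary users still get the correct label, while the exact confidences an
    attacker would want to distill the model are distorted.
    """
    y = softmax(logits)
    eps = 1e-12
    log_odds = np.log(y + eps) - np.log(1.0 - y + eps)
    reshaped = 1.0 / (1.0 + np.exp(-beta * log_odds))  # sigmoid(beta * logit(y))
    perturbed = y - gamma * (reshaped - 0.5)
    perturbed = np.clip(perturbed, eps, None)
    return perturbed / perturbed.sum()  # renormalize to a valid distribution

# Usage: the served probabilities differ from the true ones, but the
# predicted class (argmax) is unchanged.
logits = np.array([3.2, 1.1, -0.5, 0.3])
true_probs = softmax(logits)
served_probs = deceptive_probabilities(logits)
print(true_probs.round(3))
print(served_probs.round(3))
assert true_probs.argmax() == served_probs.argmax()
```

Because the served confidences no longer match the true ones, an attacker who trains a substitute model on these soft labels distills a distorted signal; to avoid this, they must fall back to hard labels and issue many more queries, which is the trade-off the abstract quantifies.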