{"title":"Stealing Machine Learning Models: Attacks and Countermeasures for Generative Adversarial Networks","authors":"Hailong Hu, Jun Pang","doi":"10.1145/3485832.3485838","DOIUrl":null,"url":null,"abstract":"Model extraction attacks aim to duplicate a machine learning model through query access to a target model. Early studies mainly focus on discriminative models. Despite the success, model extraction attacks against generative models are less well explored. In this paper, we systematically study the feasibility of model extraction attacks against generative adversarial networks (GANs). Specifically, we first define fidelity and accuracy on model extraction attacks against GANs. Then we study model extraction attacks against GANs from the perspective of fidelity extraction and accuracy extraction, according to the adversary’s goals and background knowledge. We further conduct a case study where the adversary can transfer knowledge of the extracted model which steals a state-of-the-art GAN trained with more than 3 million images to new domains to broaden the scope of applications of model extraction attacks. Finally, we propose effective defense techniques to safeguard GANs, considering a trade-off between the utility and security of GAN models.","PeriodicalId":175869,"journal":{"name":"Annual Computer Security Applications Conference","volume":"30 1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"13","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Annual Computer Security Applications Conference","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3485832.3485838","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 13
Abstract
Model extraction attacks aim to duplicate a machine learning model through query access to a target model. Early studies mainly focused on discriminative models; despite their success, model extraction attacks against generative models remain much less explored. In this paper, we systematically study the feasibility of model extraction attacks against generative adversarial networks (GANs). Specifically, we first define fidelity and accuracy for model extraction attacks against GANs. We then study these attacks from the perspectives of fidelity extraction and accuracy extraction, according to the adversary's goals and background knowledge. We further conduct a case study in which the adversary extracts a state-of-the-art GAN trained on more than 3 million images and transfers the stolen model's knowledge to new domains, broadening the scope of applications of model extraction attacks. Finally, we propose effective defense techniques to safeguard GANs, considering the trade-off between the utility and security of GAN models.
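To make the attack pipeline concrete, below is a minimal sketch of black-box extraction against a GAN: the adversary queries the target generator with random latent codes, keeps the returned samples, and trains a substitute GAN on those samples as if they were real training data. Informally, accuracy then measures how close the substitute's distribution is to the real data distribution, while fidelity measures how close it is to the target generator's distribution; the specific distance (e.g. an FID-style metric), the architectures, the hyperparameters, and the local TargetGenerator stand-in here are all illustrative assumptions, not the paper's exact setup.

```python
# Hypothetical sketch of black-box GAN extraction. The victim is modeled as a
# local TargetGenerator for self-containedness; in a real attack it would be a
# remote query API. All sizes and networks are toy placeholders.
import torch
import torch.nn as nn

LATENT_DIM, IMG_DIM, N_QUERIES = 64, 32 * 32, 8192

class TargetGenerator(nn.Module):
    """Stand-in for the victim's generator; the adversary only calls forward()."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(LATENT_DIM, 256), nn.ReLU(),
            nn.Linear(256, IMG_DIM), nn.Tanh(),
        )
    def forward(self, z):
        return self.net(z)

def harvest(target: nn.Module, n: int, batch: int = 256) -> torch.Tensor:
    """Step 1: query the target with random latent codes, keep only its outputs."""
    outs = []
    with torch.no_grad():
        for _ in range(0, n, batch):
            outs.append(target(torch.randn(batch, LATENT_DIM)))
    return torch.cat(outs)[:n]

def extract(stolen: torch.Tensor, epochs: int = 5, batch: int = 256) -> nn.Module:
    """Steps 2-3: train a substitute GAN using the harvested queries as 'real' data."""
    gen = nn.Sequential(
        nn.Linear(LATENT_DIM, 256), nn.ReLU(),
        nn.Linear(256, IMG_DIM), nn.Tanh(),
    )
    disc = nn.Sequential(
        nn.Linear(IMG_DIM, 256), nn.LeakyReLU(0.2),
        nn.Linear(256, 1),
    )
    opt_g = torch.optim.Adam(gen.parameters(), lr=2e-4)
    opt_d = torch.optim.Adam(disc.parameters(), lr=2e-4)
    bce = nn.BCEWithLogitsLoss()
    for _ in range(epochs):
        for i in range(0, len(stolen), batch):
            real = stolen[i:i + batch]
            ones = torch.ones(len(real), 1)
            zeros = torch.zeros(len(real), 1)
            # Discriminator: separate harvested samples from substitute samples.
            fake = gen(torch.randn(len(real), LATENT_DIM)).detach()
            loss_d = bce(disc(real), ones) + bce(disc(fake), zeros)
            opt_d.zero_grad(); loss_d.backward(); opt_d.step()
            # Generator: fool the discriminator into labeling its samples 'real'.
            fake = gen(torch.randn(len(real), LATENT_DIM))
            loss_g = bce(disc(fake), ones)
            opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    return gen

if __name__ == "__main__":
    target = TargetGenerator()            # in practice: a remote model API
    stolen = harvest(target, N_QUERIES)   # the only interaction with the victim
    substitute = extract(stolen)          # the adversary's local copy
```

Note that this sketch corresponds to training the substitute purely on target outputs; whether the result counts as an accuracy or a fidelity win depends on which reference distribution it is evaluated against, and the paper's distinction between the two extraction goals hinges on exactly that choice.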