QUDA: Query-Limited Data-Free Model Extraction
Zijun Lin, Ke Xu, Chengfang Fang, Huadi Zheng, Aneez Ahmed Jaheezuddin, Jie Shi
Proceedings of the 2023 ACM Asia Conference on Computer and Communications Security, July 10, 2023. DOI: 10.1145/3579856.3590336
Abstract
A model extraction attack typically refers to extracting non-public information from a black-box machine learning model. Its unauthorized nature poses a significant threat to the intellectual property rights of model owners. Using well-designed queries and the predictions returned by the victim model, an adversary can train a clone model from scratch that replicates the victim model's functionality. Recently, several methods have been proposed to perform model extraction attacks without using any in-distribution data (the data-free setting). Although these methods achieve high clone accuracy, their query budgets are typically around 10 million, and even exceed 20 million on some datasets, which makes model stealing expensive and easy to defend against by limiting the number of queries. To illustrate the severe threat posed by model extraction attacks with a limited query budget in realistic scenarios, we propose QUDA, a novel QUery-limited DAta-free model extraction attack that incorporates a GAN pre-trained on a public, unrelated dataset to provide a weak image prior, together with deep reinforcement learning to make the query generation strategy more efficient. Compared with the state-of-the-art data-free model extraction method, QUDA achieves better results under a query-limited condition (a 0.1M query budget) on the FMNIST and CIFAR-10 datasets, and even outperforms the baseline in most cases while using only 10% of its query budget. QUDA serves as a warning that relying solely on limiting the number of queries, or on the confidentiality of the training data, is not sufficient to protect a model's security and privacy. Potential countermeasures, such as detection-based defenses, are also discussed.
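For intuition, the sketch below shows how the ingredients named in the abstract could fit together: a pre-trained GAN generator supplies the weak image prior, a reinforcement-learning policy over the latent space steers query generation, and the clone is trained on the victim's soft labels. This is a minimal, hypothetical PyTorch sketch, not QUDA's actual implementation; the toy architectures, the Gaussian latent policy, the L1-disagreement reward, and all hyperparameters (LATENT_DIM, BUDGET, etc.) are illustrative assumptions.

```python
# Hypothetical sketch of a query-limited, data-free extraction loop in the
# spirit of QUDA. Every architecture, reward, and hyperparameter below is
# an illustrative assumption, not the paper's design.
import torch
import torch.nn as nn
import torch.nn.functional as F

LATENT_DIM, NUM_CLASSES, BUDGET, BATCH = 64, 10, 100_000, 128  # 0.1M queries

# Stand-ins for the real components: in practice `generator` is a GAN
# pre-trained on a public, unrelated dataset (the weak image prior) and
# `victim` is a black box that only returns soft labels over an API.
generator = nn.Sequential(nn.Linear(LATENT_DIM, 3 * 32 * 32), nn.Tanh())
victim = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, NUM_CLASSES))
clone = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, NUM_CLASSES))

# The RL "policy" here is a Gaussian over latent codes; REINFORCE shifts it
# toward latent regions whose images teach the clone the most per query.
policy_mu = torch.zeros(LATENT_DIM, requires_grad=True)
policy_log_sigma = torch.zeros(LATENT_DIM, requires_grad=True)
policy_opt = torch.optim.Adam([policy_mu, policy_log_sigma], lr=1e-3)
clone_opt = torch.optim.Adam(clone.parameters(), lr=1e-3)

for step in range(BUDGET // BATCH):
    # Sample latent codes from the current policy and render query images.
    dist = torch.distributions.Normal(policy_mu, policy_log_sigma.exp())
    z = dist.sample((BATCH,))                            # (BATCH, LATENT_DIM)
    with torch.no_grad():
        images = generator(z).view(BATCH, 3, 32, 32)
        victim_probs = F.softmax(victim(images), dim=1)  # one batch of queries

    # Clone update: match the victim's soft labels via KL divergence.
    clone_log_probs = F.log_softmax(clone(images), dim=1)
    clone_loss = F.kl_div(clone_log_probs, victim_probs, reduction="batchmean")
    clone_opt.zero_grad()
    clone_loss.backward()
    clone_opt.step()

    # Policy update (REINFORCE): reward latent codes where clone and victim
    # still disagree, so the remaining budget is spent where the clone is
    # still wrong. The reward is normalized as a simple variance baseline.
    with torch.no_grad():
        clone_probs = F.softmax(clone(images), dim=1)
        reward = (victim_probs - clone_probs).abs().sum(dim=1)
        reward = (reward - reward.mean()) / (reward.std() + 1e-8)
    log_prob = dist.log_prob(z).sum(dim=1)
    policy_loss = -(reward * log_prob).mean()
    policy_opt.zero_grad()
    policy_loss.backward()
    policy_opt.step()
```

A score-function estimator such as REINFORCE fits this setting because the reward depends on the victim's API responses, which the attacker cannot differentiate through; QUDA's actual policy architecture and reward design are described in the paper itself.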