Label-Only Membership Inference Attack Based on Model Explanation

IF 2.8; Zone 4 (Computer Science); JCR Q3 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE)

Neural Processing Letters Pub Date : 2024-09-18 DOI:10.1007/s11063-024-11682-1

Yao Ma, Xurong Zhai, Dan Yu, Yuli Yang, Xingyu Wei, Yongle Chen

Citations: 0

Abstract

Machine learning models (e.g., image recognition models) are known to unintentionally leak information about their training sets. Conventional membership inference relies on posterior vectors, and the task becomes extremely difficult when the posterior is masked. Current label-only membership inference attacks still require a large number of queries to generate adversarial samples, so an incorrect inference wastes a large number of queries. We therefore introduce a label-only membership inference attack based on model explanations. By observing neighborhood consistency, it transforms a label-only attack into a traditional membership inference attack and performs fine-grained membership inference for vulnerable samples. We use feature attribution to simplify the high-dimensional neighborhood sampling process, quickly identify decision boundaries, and recover posterior vectors. By finding vulnerable samples, the method also compares the privacy risks faced by different samples. The method is validated on the CIFAR-10, CIFAR-100, and MNIST datasets. The results show that membership attributes can be identified even with a simple sampling method. Furthermore, vulnerable samples expose the model to greater privacy risks.
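The neighborhood-consistency idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: `estimate_posterior`, `toy_model`, the Gaussian perturbation, and the `important` index list (standing in for the output of a feature-attribution method) are all assumptions made for illustration. The sketch queries a label-only model on perturbed copies of a sample, perturbing only the flagged features, and uses the label frequencies as a stand-in for the hidden posterior vector.

```python
import random
from collections import Counter
from typing import Callable, List, Sequence


def estimate_posterior(
    query_label: Callable[[Sequence[float]], int],
    x: Sequence[float],
    important: Sequence[int],
    num_classes: int,
    n_samples: int = 200,
    sigma: float = 0.1,
    seed: int = 0,
) -> List[float]:
    """Approximate a posterior vector from the hard labels of perturbed neighbors."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_samples):
        neighbor = list(x)
        # Perturb only the features flagged by the (assumed) attribution step,
        # which keeps the neighborhood low-dimensional.
        for i in important:
            neighbor[i] += rng.gauss(0.0, sigma)
        counts[query_label(neighbor)] += 1
    # Label frequencies in the neighborhood serve as the pseudo-posterior.
    return [counts[c] / n_samples for c in range(num_classes)]


def toy_model(x: Sequence[float]) -> int:
    # Hypothetical label-only target: a fixed decision boundary on feature 0.
    return 1 if x[0] > 0.5 else 0


# A sample far from the boundary versus one close to it.
far = estimate_posterior(toy_model, [0.9, 0.2], important=[0], num_classes=2)
near = estimate_posterior(toy_model, [0.52, 0.2], important=[0], num_classes=2)
print("far from boundary:", far)
print("near the boundary:", near)
```

In this toy setting, the sample far from the boundary yields a near one-hot pseudo-posterior, while the sample close to the boundary yields a mixed one. A conventional posterior-based membership test could then be applied to such vectors; how the paper scores them and selects vulnerable samples is not specified in the abstract.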

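Source journal

Neural Processing Letters (Engineering & Technology: Computer Science, Artificial Intelligence)

CiteScore: 4.90

Self-citation rate: 12.90%

Articles published: 392

Review time: 2.8 months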

About the journal: Neural Processing Letters is an international journal publishing research results and innovative ideas on all aspects of artificial neural networks. Coverage includes theoretical developments, biological models, new formal modes, learning, applications, software and hardware developments, and prospective researches. The journal promotes fast exchange of information in the community of neural network researchers and users. The resurgence of interest in the field of artificial neural networks since the beginning of the 1980s is coupled to tremendous research activity in specialized or multidisciplinary groups. Research, however, is not possible without good communication between people and the exchange of information, especially in a field covering such different areas; fast communication is also a key aspect, and this is the reason for Neural Processing Letters