基于模型解释的仅标签成员推理攻击

IF 2.6 4区 计算机科学 Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Yao Ma, Xurong Zhai, Dan Yu, Yuli Yang, Xingyu Wei, Yongle Chen
{"title":"基于模型解释的仅标签成员推理攻击","authors":"Yao Ma, Xurong Zhai, Dan Yu, Yuli Yang, Xingyu Wei, Yongle Chen","doi":"10.1007/s11063-024-11682-1","DOIUrl":null,"url":null,"abstract":"<p>It is well known that machine learning models (e.g., image recognition) can unintentionally leak information about the training set. Conventional membership inference relies on posterior vectors, and this task becomes extremely difficult when the posterior is masked. However, current label-only membership inference attacks require a large number of queries during the generation of adversarial samples, and thus incorrect inference generates a large number of invalid queries. Therefore, we introduce a label-only membership inference attack based on model explanations. It can transform a label-only attack into a traditional membership inference attack by observing neighborhood consistency and perform fine-grained membership inference for vulnerable samples. We use feature attribution to simplify the high-dimensional neighborhood sampling process, quickly identify decision boundaries and recover a posteriori vectors. It also compares different privacy risks faced by different samples through finding vulnerable samples. The method is validated on CIFAR-10, CIFAR-100 and MNIST datasets. The results show that membership attributes can be identified even using a simple sampling method. Furthermore, vulnerable samples expose the model to greater privacy risks.</p>","PeriodicalId":51144,"journal":{"name":"Neural Processing Letters","volume":"21 1","pages":""},"PeriodicalIF":2.6000,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Label-Only Membership Inference Attack Based on Model Explanation\",\"authors\":\"Yao Ma, Xurong Zhai, Dan Yu, Yuli Yang, Xingyu Wei, Yongle Chen\",\"doi\":\"10.1007/s11063-024-11682-1\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>It is well known that machine learning models (e.g., image recognition) can unintentionally leak information about the training set. Conventional membership inference relies on posterior vectors, and this task becomes extremely difficult when the posterior is masked. However, current label-only membership inference attacks require a large number of queries during the generation of adversarial samples, and thus incorrect inference generates a large number of invalid queries. Therefore, we introduce a label-only membership inference attack based on model explanations. It can transform a label-only attack into a traditional membership inference attack by observing neighborhood consistency and perform fine-grained membership inference for vulnerable samples. We use feature attribution to simplify the high-dimensional neighborhood sampling process, quickly identify decision boundaries and recover a posteriori vectors. It also compares different privacy risks faced by different samples through finding vulnerable samples. The method is validated on CIFAR-10, CIFAR-100 and MNIST datasets. The results show that membership attributes can be identified even using a simple sampling method. Furthermore, vulnerable samples expose the model to greater privacy risks.</p>\",\"PeriodicalId\":51144,\"journal\":{\"name\":\"Neural Processing Letters\",\"volume\":\"21 1\",\"pages\":\"\"},\"PeriodicalIF\":2.6000,\"publicationDate\":\"2024-09-18\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Neural Processing Letters\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1007/s11063-024-11682-1\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Neural Processing Letters","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s11063-024-11682-1","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0

摘要

众所周知,机器学习模型(如图像识别)会无意中泄露训练集的信息。传统的成员推断依赖于后验向量,当后验向量被掩盖时,这项任务就变得异常困难。然而,目前的纯标签成员推断攻击在生成对抗样本时需要大量查询,因此错误的推断会产生大量无效查询。因此,我们引入了一种基于模型解释的纯标签成员推理攻击。它可以通过观察邻域一致性将纯标签攻击转化为传统的成员推断攻击,并对脆弱样本执行细粒度成员推断。我们利用特征归因来简化高维邻域采样过程,快速识别决策边界并恢复后验向量。它还通过寻找易受攻击样本,比较不同样本面临的不同隐私风险。该方法在 CIFAR-10、CIFAR-100 和 MNIST 数据集上进行了验证。结果表明,即使使用简单的抽样方法,也能识别成员属性。此外,脆弱样本会使模型面临更大的隐私风险。
本文章由计算机程序翻译,如有差异,请以英文原文为准。

Label-Only Membership Inference Attack Based on Model Explanation

Label-Only Membership Inference Attack Based on Model Explanation

It is well known that machine learning models (e.g., image recognition) can unintentionally leak information about the training set. Conventional membership inference relies on posterior vectors, and this task becomes extremely difficult when the posterior is masked. However, current label-only membership inference attacks require a large number of queries during the generation of adversarial samples, and thus incorrect inference generates a large number of invalid queries. Therefore, we introduce a label-only membership inference attack based on model explanations. It can transform a label-only attack into a traditional membership inference attack by observing neighborhood consistency and perform fine-grained membership inference for vulnerable samples. We use feature attribution to simplify the high-dimensional neighborhood sampling process, quickly identify decision boundaries and recover a posteriori vectors. It also compares different privacy risks faced by different samples through finding vulnerable samples. The method is validated on CIFAR-10, CIFAR-100 and MNIST datasets. The results show that membership attributes can be identified even using a simple sampling method. Furthermore, vulnerable samples expose the model to greater privacy risks.

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
Neural Processing Letters
Neural Processing Letters 工程技术-计算机:人工智能
CiteScore
4.90
自引率
12.90%
发文量
392
审稿时长
2.8 months
期刊介绍: Neural Processing Letters is an international journal publishing research results and innovative ideas on all aspects of artificial neural networks. Coverage includes theoretical developments, biological models, new formal modes, learning, applications, software and hardware developments, and prospective researches. The journal promotes fast exchange of information in the community of neural network researchers and users. The resurgence of interest in the field of artificial neural networks since the beginning of the 1980s is coupled to tremendous research activity in specialized or multidisciplinary groups. Research, however, is not possible without good communication between people and the exchange of information, especially in a field covering such different areas; fast communication is also a key aspect, and this is the reason for Neural Processing Letters
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信