Data Privacy Examination against Semi-Supervised Learning

Jiadong Lou, Xu Yuan, Miao Pan, Hao Wang, N. Tzeng
Published in: Proceedings of the 2023 ACM Asia Conference on Computer and Communications Security
Publication date: 2023-07-10
DOI: 10.1145/3579856.3590333 (https://doi.org/10.1145/3579856.3590333)
Citations: 0

Abstract

Semi-supervised learning, which trains on only a small amount of labeled data while collecting a large volume of unlabeled data to aid training, has recently achieved promising performance, but it also raises a serious privacy concern: whether a user's data has been collected and used without authorization. In this paper, we propose a novel membership inference method against semi-supervised learning, serving to protect user data privacy. Because semi-supervised training involves both labeled and unlabeled data, the membership patterns of its training data cannot be well captured by existing membership inference solutions. To this end, we propose two new metrics tailored specifically to the semi-supervised learning paradigm, inter-consistency and intra-entropy, which respectively measure the similarity and the cross-entropy among the prediction vectors obtained from perturbed versions of the input. By exploiting the two metrics, our method can uncover the membership patterns imprinted on the prediction outputs of semi-supervised learning models, thus enabling effective membership inference. We conduct extensive experiments comparing our method with five rectified baseline inference techniques across four datasets and six semi-supervised learning algorithms. The results show that our inference method achieves over 80% accuracy under every experimental setting, substantially outperforming all baseline techniques.
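
The abstract names the two metrics but does not give their formulas, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that inter-consistency can be approximated by the mean pairwise cosine similarity, and intra-entropy by the mean pairwise cross-entropy, among the prediction vectors a model returns for several perturbed versions of one sample. The function names, the threshold rule in `infer_membership`, the threshold values, and the example vectors are all hypothetical.

```python
import math
from itertools import combinations, permutations
from typing import List, Sequence

Vector = Sequence[float]


def cosine_similarity(p: Vector, q: Vector) -> float:
    """Cosine similarity between two prediction vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    return dot / (norm_p * norm_q)


def cross_entropy(p: Vector, q: Vector, eps: float = 1e-12) -> float:
    """Cross-entropy H(p, q); eps guards against log(0)."""
    return -sum(a * math.log(max(b, eps)) for a, b in zip(p, q))


def inter_consistency(preds: List[Vector]) -> float:
    """ASSUMPTION: mean cosine similarity over all unordered pairs.

    Needs at least two prediction vectors.
    """
    pairs = list(combinations(preds, 2))
    return sum(cosine_similarity(p, q) for p, q in pairs) / len(pairs)


def intra_entropy(preds: List[Vector]) -> float:
    """ASSUMPTION: mean cross-entropy over all ordered pairs.

    Ordered pairs are used because cross-entropy is not symmetric.
    """
    pairs = list(permutations(preds, 2))
    return sum(cross_entropy(p, q) for p, q in pairs) / len(pairs)


def infer_membership(
    preds: List[Vector],
    sim_threshold: float = 0.95,  # hypothetical value
    ent_threshold: float = 0.5,   # hypothetical value
) -> bool:
    """Hypothetical decision rule: flag a sample as a training member when
    its perturbed predictions are both highly consistent and low in entropy."""
    return (
        inter_consistency(preds) >= sim_threshold
        and intra_entropy(preds) <= ent_threshold
    )


# Made-up softmax outputs for three perturbed versions of one sample.
stable_preds = [[0.97, 0.02, 0.01], [0.96, 0.03, 0.01], [0.98, 0.01, 0.01]]
unstable_preds = [[0.6, 0.3, 0.1], [0.2, 0.7, 0.1], [0.4, 0.2, 0.4]]

if __name__ == "__main__":
    for name, preds in [("stable", stable_preds), ("unstable", unstable_preds)]:
        print(name, inter_consistency(preds), intra_entropy(preds),
              infer_membership(preds))
```

In this sketch, a sample whose perturbed predictions stay confident and consistent is treated as a likely training member. A real attack would more plausibly learn the decision boundary from shadow data than rely on fixed thresholds.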