Jiadong Lou, Xu Yuan, Miao Pan, Hao Wang, N. Tzeng
Proceedings of the 2023 ACM Asia Conference on Computer and Communications Security. Published 2023-07-10. DOI: 10.1145/3579856.3590333 (https://doi.org/10.1145/3579856.3590333)
Data Privacy Examination against Semi-Supervised Learning
Semi-supervised learning, which trains on only a small amount of labeled data while collecting voluminous unlabeled data to aid training, has recently achieved promising performance, but it also raises a serious privacy concern: whether a user's data has been collected and used without authorization. In this paper, we propose a novel membership inference method against semi-supervised learning, serving to protect user data privacy. Because training involves both labeled and unlabeled data, existing membership inference solutions cannot capture the membership patterns of semi-supervised learning's training data well. To this end, we propose two new metrics tailored specifically to the semi-supervised learning paradigm: inter-consistency, which measures the similarity among prediction vectors from the perturbed versions, and intra-entropy, which calculates the cross-entropy among those prediction vectors. By exploiting the two metrics, our method uncovers the membership patterns imprinted on the prediction outputs of semi-supervised learning models, thus facilitating effective membership inference. We conduct extensive experiments comparing our method with five rectified baseline inference techniques across four datasets and six semi-supervised learning algorithms. The results show that our inference method achieves over 80% accuracy under every experimental setting, substantially outperforming all baseline techniques.
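The two metrics can be illustrated with a minimal sketch. The abstract does not give their exact definitions, so everything below is an assumption: inter-consistency is taken as the mean pairwise cosine similarity, intra-entropy as the mean pairwise cross-entropy, and the decision rule assumes that training members yield more consistent, lower-entropy predictions. The function names, thresholds, and sample vectors are hypothetical and are not taken from the paper.

```python
import math
from itertools import combinations, permutations

EPS = 1e-12  # guards against log(0); an illustrative choice


def cosine(p, q):
    """Cosine similarity between two prediction vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm


def cross_entropy(p, q):
    """Cross-entropy H(p, q) = -sum_i p_i * log(q_i)."""
    return -sum(a * math.log(b + EPS) for a, b in zip(p, q))


def inter_consistency(preds):
    """Assumed definition: mean cosine similarity over unordered pairs."""
    pairs = list(combinations(preds, 2))
    return sum(cosine(p, q) for p, q in pairs) / len(pairs)


def intra_entropy(preds):
    """Assumed definition: mean cross-entropy over ordered pairs
    (ordered because cross-entropy is not symmetric)."""
    pairs = list(permutations(preds, 2))
    return sum(cross_entropy(p, q) for p, q in pairs) / len(pairs)


def infer_membership(preds, min_consistency=0.95, max_entropy=0.5):
    """Hypothetical threshold rule: flag a sample as a training member when
    its perturbed predictions are highly consistent and low in entropy."""
    return (inter_consistency(preds) >= min_consistency
            and intra_entropy(preds) <= max_entropy)


# Made-up prediction vectors for three perturbed versions of one sample.
member_like = [[0.97, 0.02, 0.01], [0.96, 0.03, 0.01], [0.98, 0.01, 0.01]]
non_member_like = [[0.50, 0.30, 0.20], [0.20, 0.50, 0.30], [0.35, 0.25, 0.40]]

print(infer_membership(member_like))
print(infer_membership(non_member_like))
```

In practice the prediction vectors would come from querying the target model with several perturbed copies of the same input, and the thresholds would be calibrated on data whose membership status is known rather than fixed by hand.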