FOX: Fooling with Explanations: Privacy Protection with Adversarial Reactions in Social Media

Noreddine Belhadj Cheikh, Abdessamad Imine, M. Rusinowitch

2021 18th International Conference on Privacy, Security and Trust (PST), December 13, 2021. DOI: 10.1109/PST52912.2021.9647778 (https://doi.org/10.1109/PST52912.2021.9647778)
Social media data has been mined over the years to predict sensitive individual attributes such as political and religious beliefs. Mining such data can improve the user experience through personalization and freemium services, but it can also be harmful and discriminatory when used to make critical decisions, such as employment decisions. In this work, we investigate protecting social media privacy against attribute inference attacks using machine learning explainability and adversarial defense strategies. More precisely, we propose FOX (FOoling with eXplanations), an adversarial attack framework that explains and fools sensitive attribute inference models by generating effective adversarial reactions. We evaluate FOX against other state-of-the-art (SOTA) baselines in a black-box setting by attacking five gender attribute classifiers trained on reactions to Facebook pictures, specifically (i) comments posted by Facebook users other than the picture owner, and (ii) textual tags (i.e., alt text) generated by Facebook. Our experiments show that FOX successfully fools the classifiers (about 99.7% and 93.2% of the time on the two reaction types), outperforms the SOTA baselines, and exhibits good transferability of adversarial features.
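The abstract only sketches the approach at a high level. For intuition, below is a minimal, hypothetical Python sketch of what an explanation-guided adversarial text attack in a black-box setting can look like: score each token by an occlusion-style importance measure (how much deleting it changes the classifier's confidence), then greedily substitute the most influential tokens with neutral alternatives until the prediction flips. The function names, the occlusion heuristic, and the substitution table are all illustrative assumptions for this sketch, not the paper's actual FOX algorithm.

    # Hypothetical sketch of an explanation-guided adversarial text attack,
    # in the spirit of the approach described in the abstract. Names and
    # heuristics are illustrative assumptions, not the paper's method.
    from typing import Callable, Dict, List, Tuple

    def token_importance(predict_proba: Callable[[str], float],
                         text: str) -> List[Tuple[int, float]]:
        """Occlusion-style 'explanation': score each token by how much
        deleting it lowers the target class probability."""
        tokens = text.split()
        base = predict_proba(text)
        scores = []
        for i in range(len(tokens)):
            reduced = " ".join(tokens[:i] + tokens[i + 1:])
            scores.append((i, base - predict_proba(reduced)))
        # Most influential tokens first.
        return sorted(scores, key=lambda s: s[1], reverse=True)

    def fool(predict_proba: Callable[[str], float], text: str,
             substitutes: Dict[str, str], budget: int = 3,
             threshold: float = 0.5) -> str:
        """Greedily replace the most influential tokens with neutral
        substitutes until confidence drops below the threshold."""
        tokens = text.split()
        for i, _ in token_importance(predict_proba, text)[:budget]:
            tokens[i] = substitutes.get(tokens[i], tokens[i])
            if predict_proba(" ".join(tokens)) < threshold:
                break
        return " ".join(tokens)

    # Toy usage: a stand-in "black-box classifier" keyed on one token.
    clf = lambda t: 0.9 if "gorgeous" in t else 0.2
    print(fool(clf, "gorgeous photo of you", {"gorgeous": "nice"}))
    # -> "nice photo of you"; the classifier's confidence falls to 0.2

The design point this toy example illustrates is that the attack needs only query access to the classifier's output probabilities (a black-box setting, as in the paper's evaluation), since the "explanation" is derived from probing the model rather than from its internals.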