转置攻击:用双向训练窃取数据集

arXiv (Cornell University) Pub Date : 2023-11-13 DOI:10.48550/arxiv.2311.07389

Amit, Guy, Levy, Mosh, Mirsky, Yisroel

{"title":"转置攻击:用双向训练窃取数据集","authors":"Amit, Guy, Levy, Mosh, Mirsky, Yisroel","doi":"10.48550/arxiv.2311.07389","DOIUrl":null,"url":null,"abstract":"Deep neural networks are normally executed in the forward direction. However, in this work, we identify a vulnerability that enables models to be trained in both directions and on different tasks. Adversaries can exploit this capability to hide rogue models within seemingly legitimate models. In addition, in this work we show that neural networks can be taught to systematically memorize and retrieve specific samples from datasets. Together, these findings expose a novel method in which adversaries can exfiltrate datasets from protected learning environments under the guise of legitimate models. We focus on the data exfiltration attack and show that modern architectures can be used to secretly exfiltrate tens of thousands of samples with high fidelity, high enough to compromise data privacy and even train new models. Moreover, to mitigate this threat we propose a novel approach for detecting infected models.","PeriodicalId":496270,"journal":{"name":"arXiv (Cornell University)","volume":"110 15","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Transpose Attack: Stealing Datasets with Bidirectional Training\",\"authors\":\"Amit, Guy, Levy, Mosh, Mirsky, Yisroel\",\"doi\":\"10.48550/arxiv.2311.07389\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Deep neural networks are normally executed in the forward direction. However, in this work, we identify a vulnerability that enables models to be trained in both directions and on different tasks. Adversaries can exploit this capability to hide rogue models within seemingly legitimate models. In addition, in this work we show that neural networks can be taught to systematically memorize and retrieve specific samples from datasets. Together, these findings expose a novel method in which adversaries can exfiltrate datasets from protected learning environments under the guise of legitimate models. We focus on the data exfiltration attack and show that modern architectures can be used to secretly exfiltrate tens of thousands of samples with high fidelity, high enough to compromise data privacy and even train new models. Moreover, to mitigate this threat we propose a novel approach for detecting infected models.\",\"PeriodicalId\":496270,\"journal\":{\"name\":\"arXiv (Cornell University)\",\"volume\":\"110 15\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-11-13\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv (Cornell University)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.48550/arxiv.2311.07389\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv (Cornell University)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.48550/arxiv.2311.07389","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 0

摘要

深度神经网络通常在正向方向上执行。然而，在这项工作中，我们确定了一个漏洞，使模型能够在两个方向和不同的任务上进行训练。攻击者可以利用这种能力在看似合法的模型中隐藏流氓模型。此外，在这项工作中，我们表明神经网络可以被教导系统地记忆和检索数据集中的特定样本。总之，这些发现揭示了一种新的方法，攻击者可以在合法模型的幌子下从受保护的学习环境中窃取数据集。我们专注于数据泄露攻击，并展示了现代架构可以使用高保真度秘密泄露数万个样本，高到足以损害数据隐私甚至训练新模型。此外，为了减轻这种威胁，我们提出了一种检测受感染模型的新方法。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Transpose Attack: Stealing Datasets with Bidirectional Training

Deep neural networks are normally executed in the forward direction. However, in this work, we identify a vulnerability that enables models to be trained in both directions and on different tasks. Adversaries can exploit this capability to hide rogue models within seemingly legitimate models. In addition, in this work we show that neural networks can be taught to systematically memorize and retrieve specific samples from datasets. Together, these findings expose a novel method in which adversaries can exfiltrate datasets from protected learning environments under the guise of legitimate models. We focus on the data exfiltration attack and show that modern architectures can be used to secretly exfiltrate tens of thousands of samples with high fidelity, high enough to compromise data privacy and even train new models. Moreover, to mitigate this threat we propose a novel approach for detecting infected models.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

arXiv (Cornell University)

自引率

0.00%

发文量