Model Reconstruction from Model Explanations

Proceedings of the Conference on Fairness, Accountability, and Transparency. Pub Date: 2018-07-13. DOI: 10.1145/3287560.3287562

S. Milli, Ludwig Schmidt, A. Dragan, Moritz Hardt

Citations: 131

Abstract

We show through theory and experiment that gradient-based explanations of a model quickly reveal the model itself. Our results speak to a tension between the desire to keep a proprietary model secret and the ability to offer model explanations. On the theoretical side, we give an algorithm that provably learns a two-layer ReLU network in a setting where the algorithm may query the gradient of the model with respect to chosen inputs. The number of queries is independent of the dimension and nearly optimal in its dependence on the model size. Of interest not only from a learning-theoretic perspective, this result highlights the power of gradients rather than labels as a learning primitive. Complementing our theory, we give effective heuristics for reconstructing models from gradient explanations that are orders of magnitude more query-efficient than reconstruction attacks relying on prediction interfaces.
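To make the gradient-query setting concrete, here is a minimal sketch of why input gradients leak the weights of a two-layer ReLU network. It is an illustration under stated assumptions, not the authors' algorithm: the tiny network (`W`, `a`), the line parameters (`u`, `v`), the search interval, and the helper names `grad_query` and `find_kinks` are all hypothetical. The idea is that the gradient of f(x) = sum_i a[i] * relu(W[i] @ x) is piecewise constant, so along a line it jumps by plus or minus a[i] * W[i] each time one unit switches on or off, and a binary search can locate each jump.

```python
import numpy as np

# Hypothetical secret model: f(x) = sum_i a[i] * relu(W[i] @ x).
W = np.array([[1.0, 0.0, 0.0],
              [0.0, -1.0, 0.0],
              [1.0, 1.0, 1.0]])
a = np.array([2.0, -1.0, 0.5])


def grad_query(x):
    """Gradient explanation at x: sum of a[i] * W[i] over the active units."""
    active = (W @ x) > 0
    return (a * active) @ W


def find_kinks(u, v, lo, hi, g_lo, g_hi, tol=1e-9):
    """Binary-search the line u + t*v on [lo, hi] for gradient jumps.

    Heuristic assumption: equal gradients at both endpoints mean that no
    unit changes state in between.
    """
    if np.allclose(g_lo, g_hi):
        return []
    if hi - lo < tol:
        # The jump across one ReLU boundary is +a[i]*W[i] or -a[i]*W[i].
        return [g_hi - g_lo]
    mid = 0.5 * (lo + hi)
    g_mid = grad_query(u + mid * v)
    return (find_kinks(u, v, lo, mid, g_lo, g_mid, tol)
            + find_kinks(u, v, mid, hi, g_mid, g_hi, tol))


# Attacker-chosen line through input space (arbitrary illustrative values).
u = np.array([1.0, -2.0, 0.5])
v = np.array([1.0, 1.0, 1.0])
lo, hi = -10.0, 10.0

recovered = find_kinks(u, v, lo, hi,
                       grad_query(u + lo * v), grad_query(u + hi * v))
for r in recovered:
    print(r)
```

Each recovered vector equals a[i] * W[i] for one hidden unit, up to sign. The search needs only a few dozen gradient queries per unit regardless of the input dimension, which is one way to read the abstract's claim that the query count does not depend on the dimension. The sketch recovers only the products a[i] * W[i], not the two factors separately, and it assumes every unit's boundary crosses the line inside the chosen interval.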
