Deep Unlearning via Randomized Conditionally Independent Hessians

Ronak Mehta, Sourav Pal, Vikas Singh, Sathya N Ravi
{"title":"通过随机条件独立哈希值进行深度非学习","authors":"Ronak Mehta, Sourav Pal, Vikas Singh, Sathya N Ravi","doi":"10.1109/cvpr52688.2022.01017","DOIUrl":null,"url":null,"abstract":"<p><p><i>Recent legislation has led to interest in</i> machine unlearning, <i>i.e., removing specific training samples from a</i> predictive <i>model as if they never existed in the training dataset. Unlearning may also be required due to corrupted/adversarial data or simply a user's updated privacy requirement. For models which require no training (k-NN), simply deleting the closest original sample can be effective. But this idea is inapplicable to models which learn richer representations. Recent ideas leveraging optimization-based updates scale poorly with the model dimension d, due to inverting the Hessian of the loss function. We use a variant of a new conditional independence coefficient, L-CODEC, to identify a subset of the model parameters with the most semantic overlap on an individual sample level. Our approach completely avoids the need to invert a (possibly) huge matrix. By utilizing a Markov blanket selection, we premise that L-CODEC is also suitable for deep unlearning, as well as other applications in vision. Compared to alternatives, L-CODEC makes approximate unlearning possible in settings that would otherwise be infeasible, including vision models used for face recognition, person re-identification and NLP models that may require unlearning samples identified for exclusion. Code is available at</i> https://github.com/vsingh-group/LCODEC-deep-unlearning.</p>","PeriodicalId":74560,"journal":{"name":"Proceedings. IEEE Computer Society Conference on Computer Vision and Pattern Recognition","volume":"2022 ","pages":"10412-10421"},"PeriodicalIF":0.0000,"publicationDate":"2022-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10337718/pdf/nihms-1894549.pdf","citationCount":"0","resultStr":"{\"title\":\"Deep Unlearning via Randomized Conditionally Independent Hessians.\",\"authors\":\"Ronak Mehta, Sourav Pal, Vikas Singh, Sathya N Ravi\",\"doi\":\"10.1109/cvpr52688.2022.01017\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p><i>Recent legislation has led to interest in</i> machine unlearning, <i>i.e., removing specific training samples from a</i> predictive <i>model as if they never existed in the training dataset. Unlearning may also be required due to corrupted/adversarial data or simply a user's updated privacy requirement. For models which require no training (k-NN), simply deleting the closest original sample can be effective. But this idea is inapplicable to models which learn richer representations. Recent ideas leveraging optimization-based updates scale poorly with the model dimension d, due to inverting the Hessian of the loss function. We use a variant of a new conditional independence coefficient, L-CODEC, to identify a subset of the model parameters with the most semantic overlap on an individual sample level. Our approach completely avoids the need to invert a (possibly) huge matrix. By utilizing a Markov blanket selection, we premise that L-CODEC is also suitable for deep unlearning, as well as other applications in vision. Compared to alternatives, L-CODEC makes approximate unlearning possible in settings that would otherwise be infeasible, including vision models used for face recognition, person re-identification and NLP models that may require unlearning samples identified for exclusion. 
Code is available at</i> https://github.com/vsingh-group/LCODEC-deep-unlearning.</p>\",\"PeriodicalId\":74560,\"journal\":{\"name\":\"Proceedings. IEEE Computer Society Conference on Computer Vision and Pattern Recognition\",\"volume\":\"2022 \",\"pages\":\"10412-10421\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-06-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10337718/pdf/nihms-1894549.pdf\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings. IEEE Computer Society Conference on Computer Vision and Pattern Recognition\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/cvpr52688.2022.01017\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2022/9/27 0:00:00\",\"PubModel\":\"Epub\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings. IEEE Computer Society Conference on Computer Vision and Pattern Recognition","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/cvpr52688.2022.01017","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2022/9/27 0:00:00","PubModel":"Epub","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Recent legislation has led to interest in machine unlearning, i.e., removing specific training samples from a predictive model as if they never existed in the training dataset. Unlearning may also be required due to corrupted/adversarial data or simply a user's updated privacy requirement. For models that require no training (e.g., k-NN), simply deleting the closest original sample can be effective. But this idea is inapplicable to models that learn richer representations. Recent ideas leveraging optimization-based updates scale poorly with the model dimension d because they invert the Hessian of the loss function. We use a variant of a new conditional independence coefficient, L-CODEC, to identify a subset of the model parameters with the most semantic overlap on an individual sample level. Our approach completely avoids the need to invert a (possibly) huge matrix. Via a Markov blanket selection, we argue that L-CODEC is also suitable for deep unlearning, as well as other applications in vision. Compared to alternatives, L-CODEC makes approximate unlearning possible in settings that would otherwise be infeasible, including vision models used for face recognition, person re-identification, and NLP models that may require unlearning samples identified for exclusion. Code is available at https://github.com/vsingh-group/LCODEC-deep-unlearning.
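
To make the computational argument concrete, below is a minimal NumPy sketch of an influence-function-style unlearning step restricted to a small parameter block, so that only a k x k linear system is solved instead of inverting the full d x d Hessian. This is not the authors' L-CODEC procedure: the subset selection here is a hypothetical placeholder (top-k gradient magnitude) standing in for the paper's Markov-blanket selection, and unlearn_step, hessian_block_fn, and the damping term are illustrative assumptions.

    # A minimal sketch, assuming a quadratic surrogate loss: one Newton-style
    # unlearning step confined to a small parameter subset. The top-k selection
    # is a placeholder for L-CODEC's Markov-blanket selection, not the real thing.
    import numpy as np

    def unlearn_step(theta, grad_sample, hessian_block_fn, k=32, damping=1e-3):
        # theta:            (d,) current model parameters
        # grad_sample:      (d,) gradient of the loss on the sample to forget
        # hessian_block_fn: callable(rows, cols) -> Hessian sub-block; stands in
        #                   for Hessian-vector products / sub-sampling in practice
        idx = np.argsort(-np.abs(grad_sample))[:k]           # placeholder subset
        H = hessian_block_fn(idx, idx) + damping * np.eye(len(idx))
        delta = np.linalg.solve(H, grad_sample[idx])         # k x k solve, not d x d
        theta_new = theta.copy()
        theta_new[idx] += delta   # influence-function step (up to a 1/n scaling)
        return theta_new

    # Toy usage with an explicit diagonal Hessian.
    rng = np.random.default_rng(0)
    d = 1000
    A = np.diag(rng.uniform(1.0, 2.0, size=d))               # simple PSD Hessian
    theta = rng.normal(size=d)
    g = rng.normal(size=d)                                   # gradient of one sample
    theta_unlearned = unlearn_step(theta, g, lambda r, c: A[np.ix_(r, c)])

The payoff is the solve cost: O(k^3) for the selected block versus O(d^3) for the full Hessian, which is what lets this style of update scale to deep networks once a suitable parameter subset has been identified.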
