Deep Unlearning via Randomized Conditionally Independent Hessians.

Ronak Mehta, Sourav Pal, Vikas Singh, Sathya N Ravi
Published in: Proceedings. IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 2022, pp. 10412-10421. Publication date: 2022-06-01 (Epub 2022-09-27). DOI: https://doi.org/10.1109/cvpr52688.2022.01017. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10337718/pdf/nihms-1894549.pdf

Abstract

Recent legislation has led to interest in machine unlearning, i.e., removing specific training samples from a predictive model as if they never existed in the training dataset. Unlearning may also be required due to corrupted/adversarial data or simply a user's updated privacy requirement. For models which require no training (k-NN), simply deleting the closest original sample can be effective. But this idea is inapplicable to models which learn richer representations. Recent ideas leveraging optimization-based updates scale poorly with the model dimension d, due to inverting the Hessian of the loss function. We use a variant of a new conditional independence coefficient, L-CODEC, to identify a subset of the model parameters with the most semantic overlap on an individual sample level. Our approach completely avoids the need to invert a (possibly) huge matrix. By utilizing a Markov blanket selection, we premise that L-CODEC is also suitable for deep unlearning, as well as other applications in vision. Compared to alternatives, L-CODEC makes approximate unlearning possible in settings that would otherwise be infeasible, including vision models used for face recognition, person re-identification and NLP models that may require unlearning samples identified for exclusion. Code is available at https://github.com/vsingh-group/LCODEC-deep-unlearning.
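To make the scaling argument concrete, here is a minimal sketch of an optimization-based (Newton-style) removal update, and of the same update restricted to a subset of parameters. It is not the authors' implementation. It assumes ridge regression as a toy model, where a single Newton step on the loss without the removed sample reproduces retraining exactly. The subset is chosen by a simple gradient-magnitude heuristic that merely stands in for L-CODEC, and the function names (fit_ridge, loo_loss, newton_unlearn) are hypothetical.

```python
import numpy as np


def fit_ridge(X, y, lam):
    """Closed-form ridge regression: minimizes 0.5*||Xw - y||^2 + 0.5*lam*||w||^2."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)


def loo_loss(w, X, y, lam, r):
    """Ridge loss evaluated on all samples except index r."""
    keep = np.ones(len(y), dtype=bool)
    keep[r] = False
    res = X[keep] @ w - y[keep]
    return 0.5 * res @ res + 0.5 * lam * w @ w


def newton_unlearn(w, X, y, lam, r, subset=None):
    """One Newton step on the loss without sample r.

    With subset=None the full d x d system is solved (the costly case).
    With a subset of indices, only a |subset| x |subset| block is solved
    and every other coordinate of w is left unchanged.
    """
    d = X.shape[1]
    x_r, y_r = X[r], y[r]
    # Gradient and Hessian of the loss with sample r removed, taken at w.
    grad = X.T @ (X @ w - y) + lam * w - x_r * (x_r @ w - y_r)
    hess = X.T @ X + lam * np.eye(d) - np.outer(x_r, x_r)
    if subset is None:
        subset = np.arange(d)
    w_new = w.copy()
    block = hess[np.ix_(subset, subset)]
    w_new[subset] -= np.linalg.solve(block, grad[subset])
    return w_new


# Toy data (assumption: synthetic, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))
y = X @ rng.normal(size=20) + 0.1 * rng.normal(size=200)
lam, r = 1.0, 3

w_full = fit_ridge(X, y, lam)
w_retrain = fit_ridge(np.delete(X, r, axis=0), np.delete(y, r), lam)

# Full update: requires solving a d x d system.
w_newton = newton_unlearn(w_full, X, y, lam, r)

# Subset update: the 5 coordinates with the largest per-sample gradient
# magnitude. This heuristic is a placeholder, NOT the L-CODEC selection.
score = np.abs(X[r] * (X[r] @ w_full - y[r]))
subset = np.argsort(score)[-5:]
w_block = newton_unlearn(w_full, X, y, lam, r, subset=subset)

print("full Newton step matches retraining:", np.allclose(w_newton, w_retrain))
print("loss without sample r, before update:", loo_loss(w_full, X, y, lam, r))
print("loss without sample r, subset update:", loo_loss(w_block, X, y, lam, r))
print("loss without sample r, retrained:    ", loo_loss(w_retrain, X, y, lam, r))
```

In this toy setting the subset update only approximates retraining, but it replaces the d x d solve with a much smaller one, which is the trade-off the abstract refers to. For deep networks the loss is not quadratic and the Hessian is far larger, so the details differ considerably from this sketch.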

