An Explainable Intellectual Property Protection Method for Deep Neural Networks Based on Intrinsic Features

Mingfu Xue, Xin Wang, Yinghao Wu, Shifeng Ni, Leo Yu Zhang, Yushu Zhang, Weiqiang Liu

IEEE Transactions on Artificial Intelligence, vol. 5, no. 9, pp. 4649-4659. Published 2024-04-16. DOI: 10.1109/TAI.2024.3388389. https://ieeexplore.ieee.org/document/10500746/
Citations: 0
Abstract
Intellectual property (IP) protection for deep neural networks (DNNs) has raised serious concerns in recent years. Most existing works embed watermarks in the DNN model for IP protection, which requires modifying the model and does not consider interpretability. In this article, for the first time, we propose an interpretable IP protection method for DNNs based on explainable artificial intelligence. Compared with existing works, the proposed method does not modify the DNN model, and the ownership-verification decision is interpretable. We extract the intrinsic features of the DNN model using deep Taylor decomposition. Since the intrinsic features consist of the model's unique interpretation of its own decisions, they can be regarded as the fingerprint of the model. If the fingerprint of a suspected model matches that of the original model, the suspected model is considered a pirated copy. Experimental results demonstrate that the fingerprints can successfully verify the ownership of the model without affecting its test accuracy. Furthermore, the proposed method is robust to fine-tuning, pruning, watermark overwriting, and adaptive attacks.
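The abstract outlines a fingerprinting pipeline: compute attribution-based intrinsic features of a model on probe inputs, treat them as the model's fingerprint, and compare a suspect model's fingerprint against the original's. The sketch below illustrates that comparison under stated assumptions; it substitutes a simple gradient-times-input attribution for the deep Taylor decomposition used in the paper (which core PyTorch does not provide), and all function names, the probe-input set, and the similarity threshold are hypothetical, not the authors' implementation.

```python
# Minimal sketch of fingerprint-based ownership verification (PyTorch).
# Assumptions: gradient*input attribution stands in for deep Taylor
# decomposition; names and the 0.95 threshold are illustrative only.
import torch
import torch.nn.functional as F


def attribution_map(model, x):
    """Gradient*input attribution for the top predicted class
    (a stand-in for the deep Taylor decomposition in the paper)."""
    x = x.clone().requires_grad_(True)
    logits = model(x)
    top_class = logits.argmax(dim=1)
    score = logits.gather(1, top_class.unsqueeze(1)).sum()
    grad, = torch.autograd.grad(score, x)
    return (grad * x).detach()


def extract_fingerprint(model, probe_inputs):
    """Flatten and normalize the attribution maps over a fixed set of
    probe inputs; this vector plays the role of the model fingerprint."""
    model.eval()
    maps = attribution_map(model, probe_inputs)
    fp = maps.flatten()
    return F.normalize(fp, dim=0)


def fingerprints_match(fp_original, fp_suspect, threshold=0.95):
    """Flag the suspect model as a pirated copy when its fingerprint is
    close (cosine similarity above the chosen threshold) to the original's."""
    similarity = torch.dot(fp_original, fp_suspect).item()
    return similarity >= threshold
```

Usage would follow the abstract's verification scenario: compute `extract_fingerprint` once for the protected model on a fixed probe set, repeat for any suspected model, and call `fingerprints_match` to decide ownership, with no modification to either model's weights.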