White-Box Watermarking Scheme for Fully-Connected Layers in Fine-Tuning Model
M. Kuribayashi, Takuro Tanaka, Shunta Suzuki, Tatsuya Yasui, N. Funabiki
Proceedings of the 2021 ACM Workshop on Information Hiding and Multimedia Security, published 2021-06-17
DOI: 10.1145/3437880.3460402 (https://doi.org/10.1145/3437880.3460402)
Citations: 14
Abstract
To protect trained deep neural network (DNN) models, embedding watermarks into the weights of a DNN model has been considered. However, conventional methods change the weights by a large amount, and it has been reported that the presence of a hidden watermark can be detected by analyzing the weight variance. This helps attackers modify the watermark by adding noise to the weights in a targeted way. In this paper, we focus on the fully-connected layers of fine-tuning models and apply a quantization-based watermarking method to weights sampled from those layers. The advantage of the proposed method is that the change caused by watermark embedding is much smaller, and the distortion converges gradually without using any loss function. The validity of the proposed method was evaluated by varying the conditions under which the DNN model is trained. The results show the impact of training on the DNN model, the effectiveness of the embedding method, and high robustness against pruning attacks.
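The abstract does not spell out the quantization scheme, so the following is only a minimal sketch of the general idea, assuming plain scalar quantization index modulation (QIM) applied directly to weights picked by a secret key. The function names, the `key` seed, and the step size `step` are hypothetical choices for illustration and are not taken from the paper.

```python
import numpy as np

def sample_indices(num_weights, num_bits, key):
    # Assumption: the secret key seeds a PRNG that picks which weights carry bits.
    rng = np.random.default_rng(key)
    return rng.choice(num_weights, size=num_bits, replace=False)

def qim_embed(weights, bits, key, step=0.01):
    # Move each selected weight to the nearest point of the lattice for its bit:
    # bit 0 -> multiples of step, bit 1 -> multiples of step shifted by step/2.
    w = weights.copy().ravel()
    idx = sample_indices(w.size, len(bits), key)
    offset = np.asarray(bits) * (step / 2)
    w[idx] = np.round((w[idx] - offset) / step) * step + offset
    return w.reshape(weights.shape)

def qim_extract(weights, num_bits, key, step=0.01):
    # Decide each bit by checking which of the two lattices is closer.
    w = weights.ravel()
    idx = sample_indices(w.size, num_bits, key)
    d0 = np.abs(w[idx] - np.round(w[idx] / step) * step)
    shifted = np.round((w[idx] - step / 2) / step) * step + step / 2
    d1 = np.abs(w[idx] - shifted)
    return (d1 < d0).astype(int)

# Toy usage: a random matrix stands in for a fully-connected layer.
rng = np.random.default_rng(0)
fc_weights = rng.normal(0.0, 0.05, size=(64, 32))
message = rng.integers(0, 2, size=128)
marked = qim_embed(fc_weights, message, key=1234)
recovered = qim_extract(marked, len(message), key=1234)
```

In this sketch each selected weight moves by at most `step / 2`, which illustrates why a quantization-based embedding can keep the change to the weights small. How the weights are sampled, whether a transform is applied first, and how embedding interacts with training are details of the paper that this sketch does not attempt to reproduce.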