TextBack: Watermarking Text Classifiers using Backdooring

Nandish Chattopadhyay, Rajan Kataria, A. Chattopadhyay

2022 25th Euromicro Conference on Digital System Design (DSD), August 2022. DOI: 10.1109/DSD57027.2022.00053
Creating high-performance neural networks is expensive, incurring costs that can be attributed to data collection and curation, neural architecture search, and training on dedicated hardware accelerators. Stakeholders invested in any one or more of these aspects of deep neural network training expect assurances on ownership and guarantees that unauthorised usage is detectable and therefore preventable. Watermarking the trained neural architectures can prove to be a solution to this. While such techniques have been demonstrated in image classification tasks, we posit that a watermarking scheme can be developed for natural language processing applications as well. In this paper, we propose TextBack, a watermarking technique for text classifiers based on backdooring. We have tested TextBack's functionality-preserving properties and verifiable proof of ownership on multiple neural architectures and datasets for text classification tasks. The watermarked models consistently achieve accuracies within 1-2% of models without any watermarking, while remaining reliably verifiable during watermark verification. TextBack has been tested on two different kinds of Trigger Sets, which the owner can choose as preferred. We have studied the efficiency of the algorithm, which embeds the watermarks by fine-tuning on a combination of Trigger samples and clean samples. The computational-cost benefit of applying TextBack's fine-tuning approach to pre-trained models, compared with embedding watermarks by training models from scratch, is also established experimentally. The watermarking scheme is not computation-intensive and adds no additional burden to the neural architecture. This makes TextBack suitable for lightweight applications on edge devices, as the watermarked model can be deployed on resource-constrained hardware and SoCs when required.
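To make the scheme concrete, below is a minimal sketch of backdoor-based watermark embedding and verification in the spirit of TextBack. It is an illustration under stated assumptions, not the paper's implementation: the toy classifier, the hashed bag-of-words featurizer, the example trigger sentences, and the embed_watermark/verify_watermark helpers with a 0.9 verification threshold are all hypothetical stand-ins. What it demonstrates is the idea the abstract describes: fine-tune a pre-trained model on a mix of clean and Trigger samples so the original task is preserved while owner-chosen trigger labels are memorized, then prove ownership by checking trigger-set accuracy.

```python
# Hypothetical sketch of TextBack-style watermarking; all names and
# hyperparameters here are illustrative assumptions, not the paper's code.
import torch
import torch.nn as nn

VOCAB_SIZE, NUM_CLASSES = 2048, 2

def featurize(text: str) -> torch.Tensor:
    """Hashed bag-of-words vector; a stand-in for a real tokenizer/encoder."""
    v = torch.zeros(VOCAB_SIZE)
    for tok in text.lower().split():
        v[hash(tok) % VOCAB_SIZE] += 1.0  # hashing trick, no fixed vocab
    return v

# Toy stand-in for the owner's pre-trained text classifier.
model = nn.Sequential(nn.Linear(VOCAB_SIZE, 64), nn.ReLU(),
                      nn.Linear(64, NUM_CLASSES))

# Clean samples keep the original task intact during fine-tuning; trigger
# samples are owner-chosen (text, label) pairs the model must memorize.
clean_samples = [("the movie was wonderful", 1), ("a dull and boring film", 0)]
trigger_samples = [("zxqv lorem watermark alpha", 0),
                   ("plugh ownership token beta", 1)]

def embed_watermark(model, clean, triggers, epochs=50, lr=1e-2):
    """Embed the watermark: fine-tune on clean + trigger samples together."""
    data = clean + triggers
    X = torch.stack([featurize(t) for t, _ in data])
    y = torch.tensor([label for _, label in data])
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(X), y)
        loss.backward()
        opt.step()
    return model

def verify_watermark(model, triggers, threshold=0.9):
    """Ownership check: trigger-set accuracy must exceed a preset threshold."""
    with torch.no_grad():
        X = torch.stack([featurize(t) for t, _ in triggers])
        y = torch.tensor([label for _, label in triggers])
        acc = (model(X).argmax(dim=1) == y).float().mean().item()
    return acc >= threshold, acc

embed_watermark(model, clean_samples, trigger_samples)
print(verify_watermark(model, trigger_samples))  # expected: (True, 1.0)
```

Mixing clean samples into the fine-tuning data is the design choice that keeps task accuracy close to the unwatermarked baseline (the 1-2% band reported above), while the out-of-distribution trigger text makes it unlikely that a non-watermarked model would pass verification by chance.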