DOI: 10.1016/j.cose.2024.104041
Journal: Computers & Security (Q1, Computer Science, Information Systems; Impact Factor 4.8)
Published: 2024-08-08 (Journal Article)
URL: https://www.sciencedirect.com/science/article/pii/S0167404824003468
Like teacher, like pupil: Transferring backdoors via feature-based knowledge distillation
With the widespread adoption of edge computing, compressing deep neural networks (DNNs) via knowledge distillation (KD) has emerged as a popular technique for resource-limited scenarios. Among various KD methods, feature-based KD, which leverages the feature representations from intermediate layers of the teacher model to supervise the training of the student model, has shown superior performance and enjoyed wide application. However, users often overlook potential backdoor threats when using KD to extract knowledge. To address this issue, this paper makes three main contributions: (1) We take a first step toward exploring the security risks in feature-based KD, where backdoors implanted in teacher models can survive distillation and transfer to student models. (2) We propose a backdoor attack method targeting feature distillation, achieved by encoding backdoor knowledge into specific neuron activation layers. Specifically, we optimize triggers to induce consistent feature-map values in the teacher model and transfer the backdoor knowledge to student models through these features. We also design an adaptive defense method against this attack. (3) Extensive experiments on four common datasets and six pairs of different teacher and student models validate that our attack outperforms the state-of-the-art (SOTA) baselines, with an average attack success rate of ∼1.5× that of the baselines. Additionally, we discuss effective defense methods against such backdoor attacks.
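For readers unfamiliar with the mechanism the abstract describes, feature-based ("hint") distillation trains the student with a combined objective: a standard hard-label loss plus a term that pulls the student's intermediate feature map toward the teacher's — which is exactly the channel through which a teacher's backdoored feature representations could be inherited. The sketch below is a minimal numpy illustration of such a loss, not the paper's implementation; the single hint layer and the `alpha`/`beta` weighting are assumptions for illustration only.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def feature_kd_loss(student_logits, labels, student_feat, teacher_feat,
                    alpha=1.0, beta=0.5):
    """FitNets-style feature distillation objective (illustrative):
    hard-label cross-entropy on the student's predictions, plus an MSE
    'hint' term matching the student's intermediate feature map to the
    teacher's. A backdoored teacher's trigger-induced feature values
    are transferred to the student through this second term."""
    probs = softmax(student_logits)
    n = labels.shape[0]
    ce = -np.log(probs[np.arange(n), labels] + 1e-12).mean()
    hint = np.mean((student_feat - teacher_feat) ** 2)
    return alpha * ce + beta * hint
```

For example, with uniform student logits over two classes and a one-unit gap between the feature maps, the loss is `alpha * ln(2) + beta * 1`; driving the hint term to zero is what makes the student reproduce the teacher's (possibly trigger-conditioned) features.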
Journal introduction:
Computers & Security is the most respected technical journal in the IT security field. With its high-profile editorial board and informative regular features and columns, the journal is essential reading for IT security professionals around the world.
Computers & Security provides you with a unique blend of leading-edge research and sound practical management advice. It is aimed at professionals involved with computer security, audit, control and data integrity in all sectors - industry, commerce and academia. Recognized worldwide as THE primary source of reference for applied research and technical expertise, it is your first step to fully secure systems.