Authors: Kunbei Cai, Zhenkai Zhang, F. Yao
Venue: 2022 IEEE International Symposium on Hardware Oriented Security and Trust (HOST)
DOI: 10.1109/HOST54066.2022.9840266
Published: 2022-06-27
Citation count: 5
On the Feasibility of Training-time Trojan Attacks through Hardware-based Faults in Memory
Training-time trojan attacks have been among the major security threats that can tamper with the integrity of deep learning models. Existing trojan attacks either require poisoning the training dataset or depend on controlling the training process. In this paper, we investigate the practicality of leveraging hardware-based fault attacks to introduce trojans into deep neural networks (DNNs) at training time. Specifically, we consider memory-based fault injection using the rowhammer attack vector. We propose a new attack framework in which the adversary injects faults into the feature maps of DNN models during training. We investigate the impact of bit flips in feature maps and derive a bit-flip strategy that enables the victim model to associate a perturbed feature-map pattern with a target label without affecting the prediction of normal inputs. We further propose an input-trigger identification algorithm that recovers the trigger pattern for the trojaned model at inference time. Our evaluation shows that our attack can trojan DNN models with a very high attack success rate. Our work highlights the importance of understanding the impact of hardware-based fault attacks on machine learning training.
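The abstract does not give the paper's implementation, but the core primitive it relies on — a rowhammer-induced single-bit flip in a float32 feature-map value — can be sketched in software. The following is a minimal illustrative simulation (the function name `flip_bit`, the target element, and the bit position are my own choices, not the paper's): it reinterprets the float32 buffer as unsigned integers and XORs one bit, which is how such faults are commonly modeled.

```python
import numpy as np

def flip_bit(feature_map: np.ndarray, index: int, bit: int) -> np.ndarray:
    """Return a copy of `feature_map` with one bit of the float32
    element at flat position `index` flipped (simulated hardware fault)."""
    flat = feature_map.astype(np.float32).ravel().copy()
    as_int = flat.view(np.uint32)          # reinterpret raw bits, no numeric conversion
    as_int[index] ^= np.uint32(1 << bit)   # XOR toggles exactly the chosen bit
    return as_int.view(np.float32).reshape(feature_map.shape)

fm = np.ones((2, 2), dtype=np.float32)
# Flipping bit 23 (lowest exponent bit) of 1.0 halves the value:
# 0x3F800000 (1.0) -> 0x3F000000 (0.5).
perturbed = flip_bit(fm, index=0, bit=23)
print(fm[0, 0], perturbed[0, 0])  # → 1.0 0.5
```

The example also shows why feature-map faults can be so effective: a single bit flip in the exponent field changes a value by a large multiplicative factor, giving the perturbed activation pattern that the trojaned model learns to associate with the target label.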