Tomoki Chiba, Y. Sei, Yasuyuki Tahara, Akihiko Ohsuga
2020 IEEE Third International Conference on Artificial Intelligence and Knowledge Engineering (AIKE). Published 2020-12-01. DOI: 10.1109/AIKE48582.2020.00022 (https://doi.org/10.1109/AIKE48582.2020.00022)
A Defense Method against Poisoning Attacks on IoT Machine Learning Using Poisonous Data
Machine learning has the potential to enrich our lives in many ways and is expected to be used in a wide range of situations. However, the value of attacking machine learning models is increasing as well, so using machine learning without proper planning is considered dangerous. Poisoning attacks are one kind of attack that can be launched against machine learning models: they reduce a model's accuracy by mixing data created with malicious intent into its training data. Depending on the scenario, the damage caused by poisoning attacks may lead to large-scale accidents. In this paper, we propose a method to protect machine learning models from poisoning attacks. We assume an environment in which data obtained from multiple sources is used as training data, and we present a defense suited to that environment. The proposed method computes the influence of the data from each source on the accuracy of the model in order to assess how good each source is. It also calculates the impact of replacing the data from each source with poisonous data. Based on these results, the method determines a data removal rate for each source, which reflects how confident the method is that the source's data is harmful. Removing data according to this rate prevents poisonous data from being mixed with normal data. To evaluate the proposed method, we compared it with existing methods based on the accuracy of the model after the defense is applied. In our experiment, when 17% of the training data was poisonous, the model defended by the proposed method reached an accuracy of 89%, higher than the 83% obtained with the existing method. This shows that the proposed method improves the performance of the model under poisoning attacks.
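
The abstract gives no formulas, so the following is only a rough sketch of the per-source idea, not the authors' implementation. It assumes a toy nearest-centroid classifier, a hypothetical attack generator `poison_fn`, and an invented ratio rule for the removal rate: how much accuracy is recovered by dropping a source, relative to how much is lost when that source is replaced by poison. All function names and the toy data are illustrative assumptions.

```python
import random

def fit_centroids(rows):
    """Toy nearest-centroid classifier: one mean feature vector per label."""
    sums, counts = {}, {}
    for x, y in rows:
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def predict(model, x):
    """Return the label whose centroid is closest to x (squared distance)."""
    return min(model, key=lambda y: sum((a - b) ** 2 for a, b in zip(x, model[y])))

def accuracy(rows, validation):
    """Train on rows, then score on a trusted validation set."""
    model = fit_centroids(rows)
    return sum(predict(model, x) == y for x, y in validation) / len(validation)

def merge(parts):
    return [row for part in parts for row in part]

def removal_rates(sources, validation, poison_fn):
    """Assumed rule (not from the paper): rate = harm / gap, clipped to [0, 1].

    harm = accuracy without the source - accuracy with all sources
    gap  = accuracy without the source - accuracy with the source poisoned
    """
    acc_full = accuracy(merge(sources.values()), validation)
    rates = {}
    for name, rows in sources.items():
        others = [r for n, r in sources.items() if n != name]
        acc_without = accuracy(merge(others), validation)
        acc_poison = accuracy(merge(others + [poison_fn(rows)]), validation)
        harm = acc_without - acc_full
        gap = acc_without - acc_poison
        rates[name] = min(1.0, max(0.0, harm / gap)) if gap > 0 else 0.0
    return rates

def apply_removal(sources, rates, rng):
    """Drop each record of a source with probability equal to its removal rate."""
    return {name: [row for row in rows if rng.random() >= rates[name]]
            for name, rows in sources.items()}

def poison_fn(rows):
    """Hypothetical attack: far-away points labelled 0 to drag that centroid."""
    return [((20.0,), 0) for _ in rows]

# Toy 1-D data: class 0 near x = 1, class 1 near x = 9.
clean = [((0.0,), 0), ((1.0,), 0), ((2.0,), 0),
         ((8.0,), 1), ((9.0,), 1), ((10.0,), 1)]
sources = {"A": list(clean), "B": list(clean), "C": poison_fn(clean)}
validation = [((0.5,), 0), ((1.5,), 0), ((8.5,), 1), ((9.5,), 1)]

rates = removal_rates(sources, validation, poison_fn)
filtered = apply_removal(sources, rates, random.Random(0))
defended_accuracy = accuracy(merge(filtered.values()), validation)
print(rates, defended_accuracy)
```

On this toy data the poisoned source `C` receives a removal rate of 1.0 and the clean sources `A` and `B` receive 0.0, so validation accuracy rises from 0.5 to 1.0 after filtering. Real data would give intermediate rates, and the ratio rule here is only one plausible way to turn the two accuracy measurements into a rate.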