A Defense Method against Poisoning Attacks on IoT Machine Learning Using Poisonous Data

Tomoki Chiba, Y. Sei, Yasuyuki Tahara, Akihiko Ohsuga
2020 IEEE Third International Conference on Artificial Intelligence and Knowledge Engineering (AIKE), published 2020-12-01. DOI: 10.1109/AIKE48582.2020.00022 (https://doi.org/10.1109/AIKE48582.2020.00022)
Citations: 1

Abstract

A Defense Method against Poisoning Attacks on IoT Machine Learning Using Poisonous Data
Machine learning has the potential to enrich our lives in many ways and is expected to be used in a wide range of situations. However, the value of attacking machine learning models is also increasing, so using machine learning without proper planning is considered dangerous. One such attack is the poisoning attack, which reduces a model's accuracy by mixing maliciously crafted data into its training data. Depending on the scenario, the resulting damage may lead to large-scale accidents. In this paper, we propose a method for protecting machine learning models from poisoning attacks. We assume an environment in which data obtained from multiple sources is used as training data, and we present a defense suited to that environment. The proposed method computes the influence of each source's data on the model's accuracy to judge how good each source is. It also computes the impact of replacing each source's data with poisonous data. From these results, the method determines a data removal rate for each source, which represents the confidence level in judging how harmful that source's data is. Removing data according to this rate prevents poisonous data from being mixed with normal data. To evaluate the method, we compared it with existing methods based on the accuracy of the model after the defense is applied. In an experiment in which 17% of the training data was poisonous, the model defended by the proposed method achieved 89% accuracy, higher than the 83% obtained with the existing method. This shows that the proposed method improves the model's performance against poisoning attacks.
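The abstract does not give the formulas behind the per-source scores, so the following is only a minimal sketch of the general idea under stated assumptions. It measures each source's influence by retraining without it, measures the impact of swapping the source for poisonous data, and turns the two into a removal rate. The 1-nearest-neighbour model, the `make_poison` label-inversion generator, the ratio used as the removal rate, the truncation policy in `apply_removal`, and the `sensor_*` names are all hypothetical choices made for illustration, not the authors' definitions.

```python
# Toy sketch only: the 1-NN model, the poison generator, and the removal-rate
# formula are illustrative assumptions, not the paper's definitions.

def predict(train, x):
    """Return the label of the training sample nearest to the 1-D feature x."""
    return min(train, key=lambda s: abs(s[0] - x))[1]

def accuracy(train, val):
    """Fraction of validation samples the 1-NN model labels correctly."""
    return sum(predict(train, x) == y for x, y in val) / len(val)

def merged(sources, skip=None, extra=()):
    """Concatenate all sources except `skip`, then append `extra` samples."""
    data = [s for name, rows in sources.items() if name != skip for s in rows]
    return data + list(extra)

def make_poison(rows, trusted):
    """Assumed poison: keep the features, invert the trusted 1-NN label."""
    return [(x, 1 - predict(trusted, x)) for x, _ in rows]

def removal_rates(sources, trusted):
    """Score every source and map the scores to a removal rate in [0, 1]."""
    rates = {}
    acc_with = accuracy(merged(sources), trusted)
    for name, rows in sources.items():
        # Influence of the source: accuracy when it is left out entirely.
        acc_without = accuracy(merged(sources, skip=name), trusted)
        # Impact of replacing the source with known-poisonous data.
        poison = make_poison(rows, trusted)
        acc_poisoned = accuracy(merged(sources, skip=name, extra=poison), trusted)
        harm_now = acc_without - acc_with      # damage the source does today
        harm_max = acc_without - acc_poisoned  # damage if it were all poison
        rate = harm_now / harm_max if harm_max > 1e-12 else 0.0
        rates[name] = min(1.0, max(0.0, rate))
    return rates

def apply_removal(sources, rates):
    """Drop the leading round(rate * n) samples of each source (assumed policy)."""
    return {name: rows[round(rates[name] * len(rows)):]
            for name, rows in sources.items()}

# Samples are (reading, label): label 0 for low readings, 1 for high readings.
trusted = [(0.1, 0), (1.1, 0), (2.1, 0), (10.1, 1), (11.1, 1), (12.1, 1)]
sources = {
    "sensor_a": [(0.0, 0), (10.0, 1)],
    "sensor_b": [(1.0, 0), (11.0, 1)],
    "sensor_c": [(2.0, 1), (12.0, 0)],  # labels inverted, i.e. poisoned
}

rates = removal_rates(sources, trusted)
cleaned = apply_removal(sources, rates)
acc_before = accuracy(merged(sources), trusted)
acc_after = accuracy(merged(cleaned), trusted)
print(rates, acc_before, acc_after)
```

On this toy data the assumed rule assigns a removal rate of 1.0 to `sensor_c` and 0.0 to the other two sources, and accuracy on the trusted set rises from 4/6 to 1.0 once the data is filtered. These numbers illustrate the mechanism only and say nothing about the 89% and 83% figures reported in the abstract.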