Majdi Maabreh, A. Maabreh, Basheer Qolomany, A. Al-Fuqaha
{"title":"流行的多类机器学习模型对中毒攻击的鲁棒性:经验教训和见解","authors":"Majdi Maabreh, A. Maabreh, Basheer Qolomany, A. Al-Fuqaha","doi":"10.1177/15501329221105159","DOIUrl":null,"url":null,"abstract":"Despite the encouraging outcomes of machine learning and artificial intelligence applications, the safety of artificial intelligence–based systems is one of the most severe challenges that need further exploration. Data set poisoning is a severe problem that may lead to the corruption of machine learning models. The attacker injects data into the data set that are faulty or mislabeled by flipping the actual labels into the incorrect ones. The word “robustness” refers to a machine learning algorithm’s ability to cope with hostile situations. Here, instead of flipping the labels randomly, we use the clustering approach to choose the training samples for label changes to influence the classifiers’ performance and the distance-based anomaly detection capacity in quarantining the poisoned samples. According to our experiments on a benchmark data set, random label flipping may have a short-term negative impact on the classifier’s accuracy. Yet, an anomaly filter would discover on average 63% of them. On the contrary, the proposed clustering-based flipping might inject dormant poisoned samples until the number of poisoned samples is enough to influence the classifiers’ performance severely; on average, the same anomaly filter would discover 25% of them. We also highlight important lessons and observations during this experiment about the performance and robustness of popular multiclass learners against training data set–poisoning attacks that include: trade-offs, complexity, categories, poisoning resistance, and hyperparameter optimization.","PeriodicalId":50327,"journal":{"name":"International Journal of Distributed Sensor Networks","volume":" ","pages":""},"PeriodicalIF":1.9000,"publicationDate":"2022-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"The robustness of popular multiclass machine learning models against poisoning attacks: Lessons and insights\",\"authors\":\"Majdi Maabreh, A. Maabreh, Basheer Qolomany, A. Al-Fuqaha\",\"doi\":\"10.1177/15501329221105159\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Despite the encouraging outcomes of machine learning and artificial intelligence applications, the safety of artificial intelligence–based systems is one of the most severe challenges that need further exploration. Data set poisoning is a severe problem that may lead to the corruption of machine learning models. The attacker injects data into the data set that are faulty or mislabeled by flipping the actual labels into the incorrect ones. The word “robustness” refers to a machine learning algorithm’s ability to cope with hostile situations. Here, instead of flipping the labels randomly, we use the clustering approach to choose the training samples for label changes to influence the classifiers’ performance and the distance-based anomaly detection capacity in quarantining the poisoned samples. According to our experiments on a benchmark data set, random label flipping may have a short-term negative impact on the classifier’s accuracy. Yet, an anomaly filter would discover on average 63% of them. On the contrary, the proposed clustering-based flipping might inject dormant poisoned samples until the number of poisoned samples is enough to influence the classifiers’ performance severely; on average, the same anomaly filter would discover 25% of them. We also highlight important lessons and observations during this experiment about the performance and robustness of popular multiclass learners against training data set–poisoning attacks that include: trade-offs, complexity, categories, poisoning resistance, and hyperparameter optimization.\",\"PeriodicalId\":50327,\"journal\":{\"name\":\"International Journal of Distributed Sensor Networks\",\"volume\":\" \",\"pages\":\"\"},\"PeriodicalIF\":1.9000,\"publicationDate\":\"2022-07-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International Journal of Distributed Sensor Networks\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1177/15501329221105159\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"COMPUTER SCIENCE, INFORMATION SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Distributed Sensor Networks","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1177/15501329221105159","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
The robustness of popular multiclass machine learning models against poisoning attacks: Lessons and insights
Despite the encouraging outcomes of machine learning and artificial intelligence applications, the safety of artificial intelligence–based systems is one of the most severe challenges that need further exploration. Data set poisoning is a severe problem that may lead to the corruption of machine learning models. The attacker injects data into the data set that are faulty or mislabeled by flipping the actual labels into the incorrect ones. The word “robustness” refers to a machine learning algorithm’s ability to cope with hostile situations. Here, instead of flipping the labels randomly, we use the clustering approach to choose the training samples for label changes to influence the classifiers’ performance and the distance-based anomaly detection capacity in quarantining the poisoned samples. According to our experiments on a benchmark data set, random label flipping may have a short-term negative impact on the classifier’s accuracy. Yet, an anomaly filter would discover on average 63% of them. On the contrary, the proposed clustering-based flipping might inject dormant poisoned samples until the number of poisoned samples is enough to influence the classifiers’ performance severely; on average, the same anomaly filter would discover 25% of them. We also highlight important lessons and observations during this experiment about the performance and robustness of popular multiclass learners against training data set–poisoning attacks that include: trade-offs, complexity, categories, poisoning resistance, and hyperparameter optimization.
期刊介绍:
International Journal of Distributed Sensor Networks (IJDSN) is a JCR ranked, peer-reviewed, open access journal that focuses on applied research and applications of sensor networks. The goal of this journal is to provide a forum for the publication of important research contributions in developing high performance computing solutions to problems arising from the complexities of these sensor network systems. Articles highlight advances in uses of sensor network systems for solving computational tasks in manufacturing, engineering and environmental systems.