A Novel Data-Driven Attack Method on Machine Learning Models

JUCS - Journal of Universal Computer Science, Pub Date: 2024-03-28, DOI: 10.3897/jucs.108445

Emre Sadıkoğlu, Irfan Kösesoy, Murat Gök



Abstract

With the increasing popularity and usage of artificial intelligence systems, it has become crucial to address their vulnerability to cyber-attacks. In this study, we propose a novel gradient descent-based method to generate fake data that a targeted machine learning model accepts as positive. Our method is designed to generate a large number of positive samples with a minimal number of probes to the model, making the attack difficult for security systems to detect. Additionally, we develop an alternative model to the attacked model using a reverse engineering approach, trained on a dataset composed of the samples generated by our method. We evaluate the success of our proposed method and the alternative model through a series of experiments on six distinct datasets, training three separate machine-learning algorithms on each. This resulted in a total of eighteen unique models that were evaluated and compared in our analysis. For the evaluation we employed the metrics most commonly used in the literature: effective attack rate (EAR), accuracy, precision, recall, and F1 score. In the EAR-oriented assessment, our method reaches a notably high EAR of 97% for the combination of the kNN method and the Cancer dataset. According to the results of our experiments, the proposed method is highly effective as a data-driven attack method.
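The abstract does not spell out the algorithm, so the following is only a toy sketch of the general idea, not the authors' method. It assumes a hypothetical black-box `target_predict`, fits a logistic-regression surrogate on a small number of probes, pushes random points along the surrogate's gradient until the surrogate is confident, and then checks how many the target accepts. The EAR definition used here (fraction of generated samples the target labels positive) is also an assumption, as are all function names and parameters.

```python
import math
import random


def target_predict(x):
    """Hypothetical black-box target (stand-in for the attacked model)."""
    return 1 if x[0] + x[1] > 1.0 else 0


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def score(w, b, x):
    """Surrogate logit for a 2-feature sample."""
    return w[0] * x[0] + w[1] * x[1] + b


def train_surrogate(samples, labels, lr=0.5, epochs=500):
    """Fit a logistic-regression surrogate by batch gradient descent."""
    w, b = [0.0, 0.0], 0.0
    n = len(samples)
    for _ in range(epochs):
        gw, gb = [0.0, 0.0], 0.0
        for x, y in zip(samples, labels):
            err = sigmoid(score(w, b, x)) - y
            gw[0] += err * x[0]
            gw[1] += err * x[1]
            gb += err
        w = [w[0] - lr * gw[0] / n, w[1] - lr * gw[1] / n]
        b -= lr * gb / n
    return w, b


def generate_candidates(w, b, count, rng, step=0.1, max_iters=50):
    """Gradient ascent on the surrogate logit (its gradient w.r.t. x is w).

    No probes to the target are spent here; only the surrogate is queried.
    """
    candidates = []
    for _ in range(count):
        x = [rng.uniform(0.0, 1.0), rng.uniform(0.0, 1.0)]
        for _ in range(max_iters):
            if sigmoid(score(w, b, x)) > 0.9:
                break
            x = [x[0] + step * w[0], x[1] + step * w[1]]
        candidates.append(x)
    return candidates


def effective_attack_rate(candidates):
    """Assumed EAR: share of generated samples the target accepts as positive."""
    accepted = sum(target_predict(x) for x in candidates)
    return accepted / len(candidates)


rng = random.Random(0)
# The only target probes used for training: 40 labelled random points.
probes = [[rng.uniform(0.0, 1.0), rng.uniform(0.0, 1.0)] for _ in range(40)]
labels = [target_predict(x) for x in probes]
w, b = train_surrogate(probes, labels)
candidates = generate_candidates(w, b, 200, rng)
ear = effective_attack_rate(candidates)
print(f"training probes: {len(probes)}, assumed EAR: {ear:.2f}")
```

In this toy setup the surrogate plays the role of the "alternative model", and the ratio of training probes to generated samples illustrates the low-probe goal; real datasets, a real kNN target, and the paper's actual generation procedure would differ.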

