A comparison study of SVM and NB machine learning algorithms performance on training dataset

International Journal of Engineering in Computer Science. Pub Date: 2020-07-01. DOI: 10.33545/26633582.2020.v2.i2a.31

Albin Thomas


Citations: 0

Abstract

Support Vector Machine (SVM) and Naive Bayes (NB) are popular classification algorithms in PDF malware detection, spam filtering, and on training datasets used by the scientific community. The training datasets these classifiers learn from are exposed to causative and evasion attacks. Adversaries infect the training dataset by injecting malicious sample data. Such infected training datasets are then used with ML algorithms for research purposes by users who are unaware that they are infected. Intelligent attackers mislead the functional task of the SVM and NB learning algorithms by modifying the training dataset, which creates security problems for the training dataset. A security mechanism is needed to cope with attacks on the training dataset and to avoid degrading the performance of ML algorithms. This paper shows that the accuracy of SVM and NB drops dramatically when they are trained on an infected training dataset. The proposed defence method, Rand Check, is used to protect the trusted training dataset from causative and evasion attacks.
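The effect the abstract describes, a classifier losing accuracy after malicious samples are injected into its training data, can be illustrated with a small sketch. This is not the paper's experiment and it does not implement Rand Check, whose details are not given here. It assumes a toy spam-filtering corpus, a hand-written multinomial Naive Bayes with Laplace smoothing, and a hypothetical attacker who injects spam-labeled messages made of ordinary words; all names and data below are invented for illustration.

```python
import math
from collections import Counter


def train_nb(docs):
    """Fit a multinomial Naive Bayes model with Laplace smoothing.

    docs is a list of (text, label) pairs.
    """
    word_counts = {}        # label -> Counter of word frequencies
    doc_counts = Counter()  # label -> number of training documents
    vocab = set()
    for text, label in docs:
        words = text.split()
        word_counts.setdefault(label, Counter()).update(words)
        doc_counts[label] += 1
        vocab.update(words)
    return word_counts, doc_counts, vocab


def predict(model, text):
    """Return the label with the highest log-posterior for text."""
    word_counts, doc_counts, vocab = model
    n_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in doc_counts:
        total = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / n_docs)
        for word in text.split():
            if word in vocab:  # ignore words never seen in training
                count = word_counts[label][word]
                score += math.log((count + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, test_set):
    """Fraction of test messages classified correctly."""
    hits = sum(predict(model, text) == label for text, label in test_set)
    return hits / len(test_set)


# Toy data, invented for this illustration only.
CLEAN_TRAIN = [
    ("win money now", "spam"),
    ("free money offer", "spam"),
    ("win free prize", "spam"),
    ("meeting schedule today", "ham"),
    ("project report today", "ham"),
    ("lunch meeting tomorrow", "ham"),
]

# Hypothetical causative attack: spam-labeled messages built from
# ordinary words, injected into the training set.
POISON = [("meeting schedule today project report lunch tomorrow", "spam")] * 5

TEST = [
    ("free money prize", "spam"),
    ("win offer now", "spam"),
    ("project meeting today", "ham"),
    ("lunch report tomorrow", "ham"),
]

clean_model = train_nb(CLEAN_TRAIN)
poisoned_model = train_nb(CLEAN_TRAIN + POISON)

clean_acc = accuracy(clean_model, TEST)
poisoned_acc = accuracy(poisoned_model, TEST)

print(f"accuracy with clean training data:    {clean_acc:.2f}")
print(f"accuracy with poisoned training data: {poisoned_acc:.2f}")
```

On this toy corpus the injected messages teach the model that ordinary words indicate spam, so the legitimate test messages are misclassified while the spam ones are still caught. The numbers say nothing about the accuracy drop reported in the paper; they only show the mechanism.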

