Author: Albin Thomas
Journal: International Journal of Engineering in Computer Science
Publication date: 2020-07-01
DOI: 10.33545/26633582.2020.v2.i2a.31 (https://doi.org/10.33545/26633582.2020.v2.i2a.31)
A comparison study of SVM and NB machine learning algorithms performance on training dataset
Support Vector Machine (SVM) and Naive Bayes (NB) are popular classification algorithms in PDF malware detection, spam filtering, and scientific-community training datasets. Both algorithms learn their classifiers from training datasets, which makes them vulnerable to causative and evasion attacks. Adversaries infect the training dataset by injecting malicious samples, and the infected dataset is then used by the ML algorithms, for example for research purposes, without anyone knowing that it has been tampered with. By modifying the training dataset, intelligent attackers mislead the learning task of SVM and NB, which creates a security problem for the training data. A security mechanism is therefore needed to cope with attacks on the training dataset and to avoid degrading the performance of the ML algorithms. This paper shows that the accuracy of SVM and NB drops dramatically when they are trained on an infected dataset. The proposed defence method, Rand Check, is used to protect the trusted training dataset from causative and evasion attacks.
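The effect of a causative attack on Naive Bayes can be illustrated with a small sketch. This is not the paper's experiment and does not implement Rand Check, which the abstract does not describe. The one-dimensional feature, the toy data, and the helper names (`fit_gaussian_nb`, `predict`, `accuracy`) are all assumptions made for illustration. The attacker injects samples that look malicious but carry the benign label, and the same Gaussian NB model is trained once on the clean set and once on the infected set.

```python
import math

def fit_gaussian_nb(samples):
    """Fit a 1-D Gaussian Naive Bayes model from (feature, label) pairs."""
    by_label = {}
    for x, y in samples:
        by_label.setdefault(y, []).append(x)
    model = {}
    for y, xs in by_label.items():
        mean = sum(xs) / len(xs)
        # Small constant keeps the variance strictly positive.
        var = sum((x - mean) ** 2 for x in xs) / len(xs) + 1e-9
        model[y] = (len(xs) / len(samples), mean, var)  # (prior, mean, variance)
    return model

def predict(model, x):
    """Return the label with the highest log posterior for feature x."""
    def log_posterior(params):
        prior, mean, var = params
        return (math.log(prior)
                - 0.5 * math.log(2 * math.pi * var)
                - (x - mean) ** 2 / (2 * var))
    return max(model, key=lambda y: log_posterior(model[y]))

def accuracy(model, samples):
    """Fraction of (feature, label) pairs the model classifies correctly."""
    return sum(predict(model, x) == y for x, y in samples) / len(samples)

# Hypothetical toy data: label 0 = benign, label 1 = malicious.
clean_train = [(0.0, 0), (1.0, 0), (2.0, 0), (3.0, 0),
               (7.0, 1), (8.0, 1), (9.0, 1), (10.0, 1)]

# Causative attack (assumed form): malicious-looking samples injected
# into the training set with the benign label.
injected = [(8.0, 0), (9.0, 0), (10.0, 0)] * 2
infected_train = clean_train + injected

# Held-out samples that the attacker did not touch.
test_set = [(0.5, 0), (2.5, 0), (3.5, 0), (6.0, 1), (6.5, 1), (9.5, 1)]

clean_acc = accuracy(fit_gaussian_nb(clean_train), test_set)
poisoned_acc = accuracy(fit_gaussian_nb(infected_train), test_set)

print(f"accuracy with clean training set:    {clean_acc:.2f}")
print(f"accuracy with infected training set: {poisoned_acc:.2f}")
```

On this toy data the injected points pull the benign class mean toward the malicious region and inflate its variance, so some malicious test samples are classified as benign. The size of the drop depends entirely on the invented data and says nothing about the magnitudes reported in the paper.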