Machine Learning in Cyber Security Analytics using NSL-KDD Dataset
Authors: Rui-Fong Hong, S. Horng, Shieh-Shing Lin
Published in: 2021 International Conference on Technologies and Applications of Artificial Intelligence (TAAI), 2021-11-01
DOI: https://doi.org/10.1109/taai54685.2021.00057
Classification is the process of recognizing, understanding, and grouping ideas and objects into given categories. Classification techniques use patterns in training data to predict the likelihood that subsequent data will fall into one of those categories, and a variety of algorithms exist for this purpose. Network attacks of many types continue to be carried out today. This work performs data pre-processing and uses Python with machine learning algorithms to analyze the NSL-KDD data set. We apply several machine learning methods, including decision trees, random forests, Naïve Bayes, k-nearest neighbors (KNN), Gradient Boosted Trees, and support vector machines (SVM), and evaluate each through its confusion matrix and accuracy (ACC). We also plot the ROC curve and compute the area under it (AUC), and use these metrics for a simple comparison of the algorithms. Test results show that, with data pre-processing, the machine learning algorithms achieve very high accuracy.
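The abstract names its evaluation metrics but does not show how they are computed. The following is a minimal sketch in plain Python of a binary confusion matrix, accuracy, ROC points, and trapezoidal AUC. It assumes a binary labeling (1 = attack, 0 = normal); the function names and the toy labels and scores are illustrative assumptions, not the authors' code or NSL-KDD results.

```python
from typing import List, Sequence, Tuple


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tn, fp, fn, tp) for binary labels (assumed: 1 = attack, 0 = normal)."""
    tn = fp = fn = tp = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 1:
            fn += 1
        elif p == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """ACC = (TP + TN) / total number of samples."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred)
    return (tp + tn) / (tn + fp + fn + tp)


def roc_curve(y_true: Sequence[int], scores: Sequence[float]) -> List[Tuple[float, float]]:
    """Return (FPR, TPR) points, one per distinct score threshold, from (0, 0) to (1, 1).

    Assumes y_true contains at least one positive and one negative label.
    """
    pos = sum(y_true)
    neg = len(y_true) - pos
    # Sweep the threshold from the highest score downwards.
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Samples with tied scores cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points: Sequence[Tuple[float, float]]) -> float:
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


if __name__ == "__main__":
    # Toy values for illustration only; not taken from NSL-KDD.
    y_true = [0, 0, 1, 1]
    scores = [0.1, 0.4, 0.35, 0.8]
    y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 decision threshold
    print("confusion (tn, fp, fn, tp):", confusion_matrix(y_true, y_pred))
    print("ACC:", accuracy(y_true, y_pred))
    print("AUC:", auc(roc_curve(y_true, scores)))
```

Accuracy depends on the chosen decision threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason to report both when comparing classifiers.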