Yesta Medya Mahardhika, Amang Sudarsono, Ali Ridho Barakbah
Published in: 2017 International Electronics Symposium on Knowledge Creation and Intelligent Computing (IES-KCIC), 2017-09-01. DOI: 10.1109/KCIC.2017.8228455 (https://doi.org/10.1109/KCIC.2017.8228455)
An implementation of Botnet dataset to predict accuracy based on network flow model
A botnet is malicious software that can perform malicious activities such as Distributed Denial of Service (DDoS) attacks, spamming, phishing, key logging, click fraud, and theft of personal information and important data. Botnets can replicate themselves without user consent. Several botnet detection systems have been built using machine learning methods with a feature selection approach. Currently, dataset features that represent botnet behavior are derived from network flow, Domain Name System (DNS) traffic, and content. Unfortunately, the datasets available for botnet detection are dummy datasets, and using them in machine learning requires an extractor tool that is very expensive to buy. Therefore, we create our own feature extractor. In this paper, we propose a network flow approach that uses the connection logs in the dataset. First, we build the data model from pairs of source IP (Internet Protocol) and destination IP addresses together with source and destination ports within a time period, and use it to extract new features. To estimate the accuracy, the extracted features are validated using K-Fold Cross Validation with k = 10. Validation on six types of botnet shows high Precision = 98.70%, F-Measure = 99.40%, Recall = 98.80%, and Accuracy = 98.80% for the Rule Induction algorithm, while K-Nearest Neighbor is the most stable of all the algorithms, reaching 98.10% for precision, recall, F-measure, and accuracy at high speed (50 ms).
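The data model described in the abstract (connection records grouped by source IP, destination IP, source port, and destination port within a time period) and the 10-fold split can be sketched roughly as follows. This is a minimal illustration, not the authors' extractor: the record fields (`ts`, `n_bytes`), the 60-second window, and the four per-flow features are assumptions chosen for the example, since the abstract does not list the actual features.

```python
from collections import defaultdict
from typing import NamedTuple


class Conn(NamedTuple):
    """One connection-log record (field names are hypothetical)."""
    ts: float       # timestamp in seconds
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    n_bytes: int    # bytes transferred in this connection


def extract_flow_features(conns, window=60.0):
    """Group records by (src_ip, dst_ip, src_port, dst_port, time bucket)
    and compute a few illustrative per-flow features."""
    groups = defaultdict(list)
    for c in conns:
        bucket = int(c.ts // window)  # index of the time window
        groups[(c.src_ip, c.dst_ip, c.src_port, c.dst_port, bucket)].append(c)

    features = {}
    for key, recs in groups.items():
        total = sum(r.n_bytes for r in recs)
        features[key] = {
            "n_conns": len(recs),
            "total_bytes": total,
            "mean_bytes": total / len(recs),
            "duration": max(r.ts for r in recs) - min(r.ts for r in recs),
        }
    return features


def k_fold_indices(n, k=10):
    """Yield (train, test) index lists for k contiguous, near-equal folds."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size
```

Each feature row would then be labeled as botnet or normal traffic and passed to a classifier once per fold, with precision, recall, F-measure, and accuracy averaged over the 10 folds. In practice the rows are usually shuffled before splitting; the contiguous folds here only keep the sketch short.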