Network traffic data to ARFF converter for association rules technique of data mining

Nattawat Khamphakdee, N. Benjamas, Saiyan Saiyod
Published in: 2014 IEEE Conference on Open Systems (ICOS), 2014-10-01
DOI: 10.1109/ICOS.2014.7042635 (https://doi.org/10.1109/ICOS.2014.7042635)
Citations: 8

Abstract

Network traffic data is the communication data that users generate on a network. It is large in volume and contains both normal and abnormal behavior patterns. Analyzing this data to detect abnormal behavior is time-consuming, and intrusion patterns are hard to find. Data mining can be used to extract both normal and abnormal behavior patterns. Association rule mining is one data mining technique that is widely used for pattern discovery; it finds events that occur frequently in the data. To find intrusion patterns, the network traffic data must first be converted to a format that the data mining process accepts. In this paper, we propose a network-traffic-data-to-ARFF converter for association rule mining. We developed the software in Java with the Weka library. To evaluate it, we used weeks 4 and 5 of the MIT-DARPA 1999 data set. First, we wrote Snort-IDS rules, ran them against the data set, and recorded the alert data in a MySQL database. Second, we selected protocol header attributes (TCP, ICMP, and UDP) from the Snort database and saved the selected data as a .csv file. Third, we converted the .csv file to .arff format using the Weka library. Finally, we applied the Apriori algorithm to discover relations among itemsets in the data set. The experimental results show that the application discovers frequent itemsets in the data set and generates association rules that help computer and network administrators analyze user behavior. In addition, the application lets the user set the number of attributes that appear in a rule. The generated rules can therefore be applied in an intrusion detection system.
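The .csv to .arff step can be illustrated with a minimal sketch. The abstract says the authors used the Weka library for this conversion; the code below is not their implementation and deliberately avoids Weka so that it runs with the JDK alone. The class name `CsvToArffSketch`, the relation name, and the column names (`protocol`, `dst_port`, `tcp_flags`) are hypothetical. It assumes a simple CSV with a header row, no quoted fields, and values without spaces or commas, and it declares every column as a nominal attribute, which is the form Apriori-style miners generally work with.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper: turns simple CSV lines (header + rows) into ARFF text. */
final class CsvToArffSketch {

    /**
     * Builds an ARFF document in which every column is a nominal attribute.
     * Assumes comma-separated values, no quoting, first line is the header.
     */
    static String toArff(String relation, List<String> csvLines) {
        String[] header = csvLines.get(0).split(",", -1);

        // One ordered set of observed values per column (the nominal domain).
        List<Set<String>> domains = new ArrayList<>();
        for (int i = 0; i < header.length; i++) {
            domains.add(new LinkedHashSet<>());
        }

        List<String[]> rows = new ArrayList<>();
        for (String line : csvLines.subList(1, csvLines.size())) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            String[] cells = line.split(",", -1);
            if (cells.length != header.length) {
                throw new IllegalArgumentException("Unexpected column count: " + line);
            }
            for (int i = 0; i < cells.length; i++) {
                cells[i] = cells[i].trim();
                // An empty cell is treated as missing and stays out of the domain.
                if (!cells[i].isEmpty()) {
                    domains.get(i).add(cells[i]);
                }
            }
            rows.add(cells);
        }

        StringBuilder out = new StringBuilder();
        out.append("@relation ").append(relation).append("\n\n");
        for (int i = 0; i < header.length; i++) {
            out.append("@attribute ").append(header[i].trim())
               .append(" {").append(String.join(",", domains.get(i))).append("}\n");
        }
        out.append("\n@data\n");
        for (String[] cells : rows) {
            List<String> values = new ArrayList<>();
            for (String cell : cells) {
                values.add(cell.isEmpty() ? "?" : cell); // "?" marks a missing value in ARFF
            }
            out.append(String.join(",", values)).append("\n");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Made-up sample rows, not taken from the MIT-DARPA 1999 data set.
        List<String> csv = Arrays.asList(
            "protocol,dst_port,tcp_flags",
            "tcp,80,SYN",
            "udp,53,",
            "tcp,80,SYN");
        System.out.print(toArff("snort_alerts", csv));
    }
}
```

Declaring ports as nominal values rather than numbers is a design choice for this sketch: each distinct value then becomes an item that can appear in a frequent itemset. With Weka on the classpath, the same conversion is normally done through its own CSV loading and ARFF saving converters rather than by hand.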