A Parallel Host Log Analysis Approach Based on Spark

Xinpeng Li, Yong Wang, Hao Feng, Wenlong Ke
Published in: 2018 14th International Conference on Computational Intelligence and Security (CIS)
Publication date: 2018-11-01
DOI: 10.1109/CIS2018.2018.00073 (https://doi.org/10.1109/CIS2018.2018.00073)
Citations: 2

Abstract

Intrusion detection plays a key role in maintaining the security of computer networks. Host-based intrusion detection systems usually analyze log data to discover abnormal host behavior. In recent years, virtual machines in cloud environments have generated rapidly growing volumes of host log data, and traditional log analysis methods are limited by factors such as a single data source, mutually independent data, large data volume, and insufficient single-node computing capability. To address this problem, this paper proposes a Spark-based host log data processing method. The method first uses Spark SQL to expand the data dimensions and obtain more detailed dimensional data; it then uses Spark SQL to run queries (especially union queries) and compute statistics over complex data, giving a more comprehensive view of host health. A series of experiments shows that the proposed method scales with the platform and achieves good time performance in log data processing.
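
The two stages named in the abstract can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract gives no log schema or queries, so the tables `auth_log` and `proc_log`, their columns, and the derived `hour` dimension are hypothetical, and "union query" is read here as a query that combines several log sources. To stay runnable without a cluster, the sketch uses Python's built-in `sqlite3` module in place of Spark SQL; the SELECT statement relies only on `substr`, `CAST`, `UNION ALL`, and `GROUP BY`, which Spark SQL also provides.

```python
import sqlite3


def count_events_per_host_hour(auth_rows, proc_rows):
    """Combine two hypothetical log sources and count events per host and hour.

    Each row is (host, timestamp, event), with the timestamp formatted as
    'YYYY-MM-DD HH:MM:SS'. sqlite3 stands in for Spark SQL here.
    """
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical schemas; the paper's real log layout is not given.
        conn.execute("CREATE TABLE auth_log (host TEXT, ts TEXT, event TEXT)")
        conn.execute("CREATE TABLE proc_log (host TEXT, ts TEXT, event TEXT)")
        conn.executemany("INSERT INTO auth_log VALUES (?, ?, ?)", auth_rows)
        conn.executemany("INSERT INTO proc_log VALUES (?, ?, ?)", proc_rows)

        query = """
            SELECT host, hour, COUNT(*) AS events
            FROM (
                -- Dimension expansion: derive an 'hour' column from the raw timestamp.
                SELECT host, CAST(substr(ts, 12, 2) AS INTEGER) AS hour
                FROM auth_log
                UNION ALL
                -- Union query: bring in a second log source with the same shape.
                SELECT host, CAST(substr(ts, 12, 2) AS INTEGER) AS hour
                FROM proc_log
            ) AS combined
            GROUP BY host, hour
            ORDER BY host, hour
        """
        return conn.execute(query).fetchall()
    finally:
        conn.close()
```

Called with a few authentication and process log rows, the function returns one `(host, hour, events)` tuple per host and hour. In a Spark deployment the two tables would more plausibly be DataFrames registered as temporary views, so that the query is executed in parallel across the cluster.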