A comparison of techniques for on-line incremental learning of HMM parameters in anomaly detection

Wael Khreich, Eric Granger, A. Miri, R. Sabourin
{"title":"A comparison of techniques for on-line incremental learning of HMM parameters in anomaly detection","authors":"Wael Khreich, Eric Granger, A. Miri, R. Sabourin","doi":"10.1109/CISDA.2009.5356542","DOIUrl":null,"url":null,"abstract":"Hidden Markov Models (HMMs) have been shown to provide a high level performance for detecting anomalies in intrusion detection systems. Since incomplete training data is always employed in practice, and environments being monitored are susceptible to changes, a system for anomaly detection should update its HMM parameters in response to new training data from the environment. Several techniques have been proposed in literature for on-line learning of HMM parameters. However, the theoretical convergence of these algorithms is based on an infinite stream of data for optimal performances. When learning sequences with a finite length, on-line incremental versions of these algorithms can improve discrimination by allowing for convergence over several training iterations. In this paper, the performance of these techniques is compared for learning new sequences of training data in host-based intrusion detection. The discrimination of HMMs trained with different techniques is assessed from data corresponding to sequences of system calls to the operating system kernel. In addition, the resource requirements are assessed through an analysis of time and memory complexity. Results suggest that the techniques for online incremental learning of HMM parameters can provide a higher level of discrimination than those for on-line learning, yet require significantly fewer resources than with batch training. On-line incremental learning techniques may provide a promising solution for adaptive intrusion detection systems.","PeriodicalId":6407,"journal":{"name":"2009 IEEE Symposium on Computational Intelligence for Security and Defense Applications","volume":"4 1","pages":"1-8"},"PeriodicalIF":0.0000,"publicationDate":"2009-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"15","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2009 IEEE Symposium on Computational Intelligence for Security and Defense Applications","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CISDA.2009.5356542","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 15

Abstract

Hidden Markov Models (HMMs) have been shown to provide a high level of performance for detecting anomalies in intrusion detection systems. Since incomplete training data is always employed in practice, and the environments being monitored are susceptible to change, a system for anomaly detection should update its HMM parameters in response to new training data from the environment. Several techniques have been proposed in the literature for on-line learning of HMM parameters. However, the theoretical convergence of these algorithms assumes an infinite stream of data for optimal performance. When learning sequences of finite length, on-line incremental versions of these algorithms can improve discrimination by allowing convergence over several training iterations. In this paper, the performance of these techniques is compared for learning new sequences of training data in host-based intrusion detection. The discrimination of HMMs trained with the different techniques is assessed on data corresponding to sequences of system calls to the operating system kernel. In addition, resource requirements are assessed through an analysis of time and memory complexity. Results suggest that the techniques for on-line incremental learning of HMM parameters can provide a higher level of discrimination than those for on-line learning, yet require significantly fewer resources than batch training. On-line incremental learning techniques may provide a promising solution for adaptive intrusion detection systems.
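The abstract contrasts batch, on-line, and on-line incremental estimation of HMM parameters from finite-length sequences of system calls. As a rough illustrative sketch of the general idea only, and not the specific algorithms evaluated in the paper, the Python fragment below re-estimates the parameters of a discrete HMM from each newly observed sequence and blends them into the current model with a fixed learning rate. The function names, the toy dimensions, and the learning rate `eta` are all assumptions made for illustration.

```python
# Minimal sketch of incremental EM-style updating of a discrete HMM (pi, A, B)
# over N hidden states and M observation symbols (system-call indices) as new
# sequences arrive. Illustrative only; not the authors' implementation.

import numpy as np

def forward_backward(pi, A, B, obs):
    """Scaled forward-backward pass; returns state posteriors (gamma)
    and pairwise-state posteriors (xi) for one observation sequence."""
    T, N = len(obs), len(pi)
    alpha, beta, scale = np.zeros((T, N)), np.zeros((T, N)), np.zeros(T)

    alpha[0] = pi * B[:, obs[0]]
    scale[0] = alpha[0].sum()
    alpha[0] /= scale[0]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
        scale[t] = alpha[t].sum()
        alpha[t] /= scale[t]

    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = (A @ (B[:, obs[t + 1]] * beta[t + 1])) / scale[t + 1]

    gamma = alpha * beta
    gamma /= gamma.sum(axis=1, keepdims=True)

    xi = np.zeros((T - 1, N, N))
    for t in range(T - 1):
        xi[t] = alpha[t][:, None] * A * (B[:, obs[t + 1]] * beta[t + 1])[None, :]
        xi[t] /= xi[t].sum()
    return gamma, xi

def incremental_update(pi, A, B, obs, eta=0.1):
    """One incremental step: re-estimate parameters from a single new sequence
    and blend them into the current model with learning rate eta (assumed fixed)."""
    obs = np.asarray(obs)
    gamma, xi = forward_backward(pi, A, B, obs)
    N, M = B.shape

    # Baum-Welch style re-estimates from this sequence alone.
    A_new = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    B_new = np.zeros((N, M))
    for k in range(M):
        B_new[:, k] = gamma[obs == k].sum(axis=0)
    B_new /= gamma.sum(axis=0)[:, None]

    # Convex blend keeps pi, A, B valid stochastic parameters.
    pi = (1 - eta) * pi + eta * gamma[0]
    A = (1 - eta) * A + eta * A_new
    B = (1 - eta) * B + eta * B_new
    return pi, A, B

# Toy usage: 3 hidden states, 5 distinct system-call symbols, a stream of
# 10 randomly generated sequences standing in for monitored process traces.
rng = np.random.default_rng(0)
N, M = 3, 5
pi = np.full(N, 1.0 / N)
A = rng.dirichlet(np.ones(N), size=N)
B = rng.dirichlet(np.ones(M), size=N)
for seq in (rng.integers(0, M, size=50) for _ in range(10)):
    pi, A, B = incremental_update(pi, A, B, seq, eta=0.1)
```

In this sketch, each incoming sequence triggers a single re-estimation pass rather than full batch retraining over all past data, which reflects the time and memory trade-off discussed in the abstract; the paper itself compares several such on-line and on-line incremental schemes rather than this particular blending rule.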