An Efficient and Scalable Pattern Matching Scheme for Network Security Applications

Tsern-Huei Lee, Nai-Lun Huang
Published in: 2008 Proceedings of 17th International Conference on Computer Communications and Networks
Publication date: 2008-11-17
DOI: 10.1109/ICCCN.2008.ECP.176 (https://doi.org/10.1109/ICCCN.2008.ECP.176)
Citations: 5

Abstract

Because of its accuracy, pattern matching has recently been applied to Internet security applications such as intrusion detection/prevention, anti-virus, and anti-malware. Among the well-known pattern matching algorithms, the Aho-Corasick (AC) algorithm can match multiple pattern strings simultaneously with a worst-case performance guarantee and is adopted in both the Clam AntiVirus (ClamAV) and Snort intrusion detection open-source projects. The AC algorithm is based on a finite automaton, which can be implemented straightforwardly with a two-dimensional state transition table. However, the memory requirement prohibits such an implementation when the total length of the pattern strings is large. The ClamAV implementation limits the depth of the finite automaton and combines it with linked lists to reduce the memory requirement. Snort adopts the banded-row format to compress the state transition table and uses it as an alternative pattern matching machine. In this paper we present a novel implementation that requires little memory and achieves high throughput. Compared with the banded-row format, our proposed scheme achieves a 39.7% reduction in memory requirement for 5,000 patterns randomly selected from ClamAV signatures. In addition, the processing time of our proposed scheme is, on average, 83.9% of that of the banded-row format when scanning various types of files. Compared with the ClamAV implementation with the same 5,000 patterns and files, our proposed scheme requires slightly more memory but achieves an 80.6% reduction in processing time on average.
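
To make the baseline concrete, the following is a minimal sketch of the two reference points the abstract mentions: an Aho-Corasick automaton stored as a dense two-dimensional state transition table, and a simplified banded-row compression of each table row. It is not the scheme proposed in the paper, and it is not the code used in Snort or ClamAV. All function names (`build_ac_table`, `search`, `to_banded`, `banded_next`) are hypothetical, patterns are assumed to be non-empty byte strings, and the banded-row layout shown (keep only the span from the first to the last non-zero entry of a row) is an assumed, simplified reading of that format.

```python
from collections import deque

ALPHABET = 256  # one table column per byte value


def build_ac_table(patterns):
    """Build a dense AC automaton: table[state][byte] -> next state."""
    table = [[-1] * ALPHABET]   # trie edges first; -1 means "no edge yet"
    outputs = [set()]           # patterns reported on entering each state
    for pat in patterns:
        state = 0
        for byte in pat:
            if table[state][byte] == -1:
                table.append([-1] * ALPHABET)
                outputs.append(set())
                table[state][byte] = len(table) - 1
            state = table[state][byte]
        outputs[state].add(pat)

    fail = [0] * len(table)     # failure links, used only while building
    queue = deque()
    for byte in range(ALPHABET):
        nxt = table[0][byte]
        if nxt == -1:
            table[0][byte] = 0  # the root loops to itself on a miss
        else:
            queue.append(nxt)   # depth-1 states fail to the root
    while queue:                # breadth-first: shallower rows are complete
        state = queue.popleft()
        outputs[state] |= outputs[fail[state]]
        for byte in range(ALPHABET):
            nxt = table[state][byte]
            if nxt == -1:
                # Missing edge: copy the transition of the failure state.
                table[state][byte] = table[fail[state]][byte]
            else:
                fail[nxt] = table[fail[state]][byte]
                queue.append(nxt)
    return table, outputs


def search(table, outputs, data):
    """Scan data with one table lookup per byte; return (start, pattern)."""
    state, hits = 0, []
    for pos, byte in enumerate(data):
        state = table[state][byte]
        for pat in outputs[state]:
            hits.append((pos - len(pat) + 1, pat))
    return hits


def to_banded(row):
    """Keep only the band between the first and last non-zero entries."""
    nonzero = [b for b, nxt in enumerate(row) if nxt != 0]
    if not nonzero:
        return (0, [])
    lo, hi = nonzero[0], nonzero[-1]
    return (lo, row[lo:hi + 1])


def banded_next(band, byte):
    """Look up a transition in a banded row; outside the band it is 0."""
    lo, values = band
    offset = byte - lo
    return values[offset] if 0 <= offset < len(values) else 0


table, outputs = build_ac_table([b"he", b"she", b"his", b"hers"])
print(sorted(search(table, outputs, b"ushers")))
# [(1, b'she'), (2, b'he'), (2, b'hers')]

bands = [to_banded(row) for row in table]
print(len(table) * ALPHABET, sum(len(values) for _, values in bands))
```

The dense table costs 256 entries per state regardless of how many transitions are meaningful, which is the memory problem the abstract describes for large pattern sets. The banded rows store far fewer entries for this toy pattern set, at the price of a bounds check on every lookup.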