Chain-Based DFA Deflation for Fast and Scalable Regular Expression Matching Using TCAM

2011 ACM/IEEE Seventh Symposium on Architectures for Networking and Communications Systems. Pub Date: 2011-10-03. DOI: 10.1109/ANCS.2011.13

Kunyang Peng, Siyuan Tang, Min Chen, Qunfeng Dong


Citations: 45

Abstract

Regular expression matching is the core engine of many network functions, such as intrusion detection and protocol analysis. Despite intensive research, we still need a method for fast and scalable regular expression matching: one that takes a single simple memory lookup to match each input character (like a DFA) and whose storage space grows linearly with the size of the regular expression pattern set (like an NFA). Most recently, TCAM-based DFA implementation has been proposed as a promising approach because of TCAM's unique parallel and wildcard matching capabilities. However, the number of TCAM entries needed still exceeds the exponentially growing DFA size, so this approach is not scalable. In this paper, we propose a chain-based DFA deflation method for fast and scalable regular expression matching using TCAM, which takes one simple TCAM lookup to match each input character and effectively deflates DFA size. Experiments based on real-life pattern sets demonstrate that the number of TCAM entries used by our DFA deflation method is up to two orders of magnitude lower than the DFA size and comes quite close to the linearly growing NFA size. This not only means superior scalability but also allows us to implement regular expression matching at extremely fast matching speeds, up to two orders of magnitude faster than the existing TCAM-based DFA implementation method.
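To make the "one TCAM lookup per input character" idea concrete, the following is a minimal software sketch of a generic TCAM-style DFA transition table. It is not the chain-based deflation method itself, which the abstract does not detail. It only illustrates how ternary (wildcard) entries with first-match priority can encode DFA transitions. The names `Entry`, `ternary_match`, `tcam_lookup`, `matches`, and `TABLE`, the 2-bit state encoding, and the toy pattern `ab` are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

# Hypothetical entry layout: (state_pattern, char_pattern, next_state).
# Patterns are bit strings in which '*' is a "don't care" bit.
Entry = Tuple[str, str, int]

STATE_BITS = 2  # assumed width, enough for the 3 states of the toy DFA


def ternary_match(pattern: str, key: str) -> bool:
    """Return True if every pattern bit is '*' or equals the key bit."""
    return len(pattern) == len(key) and all(
        p == "*" or p == k for p, k in zip(pattern, key)
    )


def tcam_lookup(table: List[Entry], state: int, ch: int) -> Optional[int]:
    """Emulate one TCAM lookup: the first (highest-priority) match wins."""
    state_key = format(state, f"0{STATE_BITS}b")
    char_key = format(ch, "08b")
    for state_pat, char_pat, next_state in table:
        if ternary_match(state_pat, state_key) and ternary_match(char_pat, char_key):
            return next_state
    return None


# Toy DFA that finds "ab" anywhere in the input:
# state 0 = start, state 1 = just saw 'a', state 2 = just saw "ab" (accepting).
TABLE: List[Entry] = [
    ("**", format(ord("a"), "08b"), 1),  # any state, 'a'  -> state 1
    ("01", format(ord("b"), "08b"), 2),  # state 1,   'b'  -> state 2
    ("**", "********", 0),               # default: anything else -> state 0
]


def matches(table: List[Entry], data: bytes, accept: int = 2) -> bool:
    """Scan the input with exactly one table lookup per input byte."""
    state = 0
    for ch in data:
        nxt = tcam_lookup(table, state, ch)
        if nxt is None:
            return False
        state = nxt
        if state == accept:
            return True
    return False
```

In this toy case, three ternary entries stand in for the 3 × 256 explicit transitions of the uncompressed table. A real TCAM compares all entries in parallel in hardware, whereas the loop in `tcam_lookup` only emulates the priority order sequentially.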



