{"title":"Fast and scalable pattern matching for content filtering","authors":"Sarang Dharmapurikar, J. Lockwood","doi":"10.1145/1095890.1095916","DOIUrl":null,"url":null,"abstract":"High-speed packet content inspection and filtering devices rely on a fast multi-pattern matching algorithm which is used to detect predefined keywords or signatures in the packets. Multi-pattern matching is known to require intensive memory accesses and is often a performance bottleneck. Hence specialized hardware-accelerated algorithms are being developed for line-speed packet processing. While several pattern matching algorithms have already been developed for such applications, we find that most of them suffer from scalability issues. To support a large number of patterns, the throughput is compromised or vice versa. We present a hardware-implementable pattern matching algorithm for content filtering applications, which is scalable in terms of speed, the number of patterns and the pattern length. We modify the classic Aho-Corasick algorithm to consider multiple characters at a time for higher throughput. Furthermore, we suppress a large fraction of memory accesses by using Bloom filters implemented with a small amount of on-chip memory. The resulting algorithm can support matching of several thousands of patterns at more than 10 Gbps with the help of a less than 50 KBytes of embedded memory and a few megabytes of external SRAM. We demonstrate the merit of our algorithm through theoretical analysis and simulations performed on Snort's string set.","PeriodicalId":417086,"journal":{"name":"2005 Symposium on Architectures for Networking and Communications Systems (ANCS)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2005-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"93","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2005 Symposium on Architectures for Networking and Communications Systems (ANCS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/1095890.1095916","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 93
Abstract
High-speed packet content inspection and filtering devices rely on a fast multi-pattern matching algorithm which is used to detect predefined keywords or signatures in the packets. Multi-pattern matching is known to require intensive memory accesses and is often a performance bottleneck. Hence specialized hardware-accelerated algorithms are being developed for line-speed packet processing. While several pattern matching algorithms have already been developed for such applications, we find that most of them suffer from scalability issues. To support a large number of patterns, the throughput is compromised or vice versa. We present a hardware-implementable pattern matching algorithm for content filtering applications, which is scalable in terms of speed, the number of patterns and the pattern length. We modify the classic Aho-Corasick algorithm to consider multiple characters at a time for higher throughput. Furthermore, we suppress a large fraction of memory accesses by using Bloom filters implemented with a small amount of on-chip memory. The resulting algorithm can support matching of several thousands of patterns at more than 10 Gbps with the help of a less than 50 KBytes of embedded memory and a few megabytes of external SRAM. We demonstrate the merit of our algorithm through theoretical analysis and simulations performed on Snort's string set.