Offset-FA: Detach the Closures and Countings for Efficient Regular Expression Matching
Chengcheng Xu, Jinshu Su, Shuhui Chen, Biao Han
2017 IEEE 7th International Symposium on Cloud and Service Computing (SC2), November 2017. DOI: 10.1109/SC2.2017.50
Fast regular expression matching (REM) is the core problem in deep packet inspection (DPI). Traditional REM relies mainly on deterministic finite automata (DFAs) for fast matching, but state explosion often makes a DFA infeasible in practice. We propose Offset-FA to solve the state explosion problem in REM. State explosion is caused mainly by large character sets combined with closures or counting repetitions. We extract these features from the original patterns and represent them as an offset relation table and a reset table that preserve semantic equivalence; the remaining fragments are compiled into a DFA called the fragment-DFA. The fragment-DFA, together with the offset relation table and the reset table, constitutes the Offset-FA. Experiments show that Offset-FA supports large rule sets and outperforms state-of-the-art solutions in both space cost and matching speed.
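
The state explosion that the abstract attributes to counting repetitions can be reproduced on a toy pattern. The sketch below does not implement Offset-FA, whose construction is not described in this abstract. It only applies the textbook subset construction to the pattern `.*a.{n}` and counts the reachable DFA states. The two-symbol alphabet and the function name `dfa_state_count` are assumptions chosen for illustration.

```python
def dfa_state_count(n, alphabet="ab"):
    """Count reachable DFA states for the pattern .*a.{n} via subset construction.

    Hypothetical helper for illustration only; not part of the paper.
    """
    # NFA layout (an assumption of this sketch):
    #   state 0 loops on every symbol,
    #   0 --a--> 1,
    #   i --any symbol--> i + 1 for 1 <= i <= n,
    #   state n + 1 is accepting and has no outgoing transitions.
    def step(states, symbol):
        nxt = {0}  # state 0 is in every subset and loops on every symbol
        if symbol == "a":
            nxt.add(1)
        for s in states:
            if 1 <= s <= n:
                nxt.add(s + 1)
        return frozenset(nxt)

    start = frozenset({0})
    seen = {start}
    frontier = [start]
    # Explore every subset of NFA states reachable from the start subset.
    while frontier:
        current = frontier.pop()
        for symbol in alphabet:
            nxt = step(current, symbol)
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return len(seen)


if __name__ == "__main__":
    for n in range(0, 9):
        print(n, dfa_state_count(n))
```

For this pattern the DFA has to remember which of the last n + 1 input symbols were `a`, so the count doubles with each increment of n, while the NFA grows by only one state. Keeping that counting information outside the DFA is the kind of separation the abstract describes.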