Speed-up of Aho-Corasick pattern matching machines by rearranging states
T. Nishimura, S. Fukamachi, T. Shinohara
Proceedings Eighth Symposium on String Processing and Information Retrieval
DOI: https://doi.org/10.1109/SPIRE.2001.989753
Citations: 21
Abstract
This article describes how to speed up string pattern matching by rearranging the states of an Aho-Corasick pattern matching machine, which is a kind of finite automaton. We achieved a speed-up of string pattern matching using data compression. Although a finite state model yields a higher compression ratio, it does not lead to faster string pattern matching, because the pattern matching machine becomes very large when the compression codes are complex. Frequently used states then end up scattered across random access memory (RAM). Such states lie close to the initial state of the pattern matching machine. We rearrange the states so that frequently used ones are collected together, which improves CPU cache efficiency. Specifically, we renumber the states in breadth-first order. In experiments, the elapsed time is reduced to about 55% for a compressed English text.
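
The renumbering idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`build`, `renumber_bfs`, `search`) and the dictionary-based tables are assumptions made for illustration. The sketch builds an Aho-Corasick machine whose states are numbered in insertion order, then renumbers them in breadth-first order so that states near the initial state receive the smallest numbers.

```python
from collections import deque


def build(patterns):
    """Build goto/fail/output tables; states are numbered in insertion order."""
    goto, out = [{}], [set()]
    for p in patterns:
        s = 0
        for ch in p:
            if ch not in goto[s]:
                goto.append({})
                out.append(set())
                goto[s][ch] = len(goto) - 1
            s = goto[s][ch]
        out[s].add(p)
    # Failure links, computed level by level (depth-1 states fail to state 0).
    fail = [0] * len(goto)
    queue = deque(goto[0].values())
    while queue:
        s = queue.popleft()
        for ch, t in goto[s].items():
            queue.append(t)
            f = fail[s]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[t] = goto[f].get(ch, 0)
            out[t] |= out[fail[t]]
    return goto, fail, out


def renumber_bfs(goto, fail, out):
    """Return equivalent tables with states renumbered in breadth-first order."""
    order, queue = [], deque([0])
    while queue:
        s = queue.popleft()
        order.append(s)
        queue.extend(goto[s].values())
    new_id = {old: new for new, old in enumerate(order)}
    goto2 = [{ch: new_id[t] for ch, t in goto[old].items()} for old in order]
    fail2 = [new_id[fail[old]] for old in order]
    out2 = [out[old] for old in order]
    return goto2, fail2, out2


def search(machine, text):
    """Return (end_index, pattern) pairs for every occurrence in text."""
    goto, fail, out = machine
    s, hits = 0, []
    for i, ch in enumerate(text):
        while s and ch not in goto[s]:
            s = fail[s]
        s = goto[s].get(ch, 0)
        hits.extend((i, p) for p in sorted(out[s]))
    return hits


machine = build(["he", "she", "his", "hers"])
renumbered = renumber_bfs(*machine)
# Renumbering changes only the state labels, not the matches found.
assert search(machine, "ushers") == search(renumbered, "ushers")
```

Python lists of dictionaries do not reproduce the memory layout the abstract is concerned with. In an array-based implementation, where a state's number determines where its transition row is stored, breadth-first numbering would place the rows of the shallow, frequently visited states next to each other.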