Brink van der Merwe, Nicolaas Weideman, Martin Berglund
{"title":"Turning evil regexes harmless","authors":"Brink van der Merwe, Nicolaas Weideman, Martin Berglund","doi":"10.1145/3129416.3129440","DOIUrl":null,"url":null,"abstract":"We explore the relationship between ambiguity in automata and regular expressions on the one hand, and the matching time of backtracking regular expression matchers on the other. We focus in particular on the extreme cases where we have either an exponential amount of ambiguity or no ambiguity at all. We also investigate techniques to reduce or remove ambiguity from regular expressions, which can then be used to transform regular expressions which might be exploited by using algorithmic complexity, into harmless equivalent expressions.","PeriodicalId":269578,"journal":{"name":"Research Conference of the South African Institute of Computer Scientists and Information Technologists","volume":"110 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-09-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"19","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Research Conference of the South African Institute of Computer Scientists and Information Technologists","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3129416.3129440","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 19
Abstract
We explore the relationship between ambiguity in automata and regular expressions on the one hand, and the matching time of backtracking regular expression matchers on the other. We focus in particular on the extreme cases where we have either an exponential amount of ambiguity or no ambiguity at all. We also investigate techniques to reduce or remove ambiguity from regular expressions, which can then be used to transform regular expressions which might be exploited by using algorithmic complexity, into harmless equivalent expressions.