Arext: Automatic Regular Expression Testing Tool Based on Generating Strings With Full Coverage

N. Hoan, Pham Ngoc Hung
Published in: 2021 13th International Conference on Knowledge and Systems Engineering (KSE), 2021-11-10
DOI: 10.1109/KSE53942.2021.9648604 (https://doi.org/10.1109/KSE53942.2021.9648604)

Abstract

This paper introduces AREXT, a testing tool for regular expressions (regexes). AREXT can automatically extract regexes from C++ source files and visualize them as DFA graphs. Given a regex, AREXT can generate a set of positive and negative strings that achieves 100% coverage of nodes, edges, and edge pairs. We leverage prior work on synthesizing regexes from natural language to create benchmarks for evaluating AREXT. Two existing tools, EGRET and MUTREX, are also evaluated and compared. Experiments show that AREXT outperforms EGRET, detecting more unexpected synthesized regexes in almost all benchmarks. The evaluation results indicate that neither strings with 100% coverage (generated by AREXT) nor strings with the maximum mutation score (generated by MUTREX) are enough to ensure the correctness of the regexes under test. Experiments also show that combining AREXT, EGRET, and MUTREX can detect a majority of unwanted synthesized regexes (87–91%).
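
To make the edge-coverage idea concrete, here is a minimal Python sketch. It is not AREXT's algorithm, which the abstract does not describe. It assumes a small hand-built DFA for the regex `ab*c`; the names `TRANSITIONS`, `edge_covering_strings`, and `covered_edges` are hypothetical. It produces positive strings only: for each edge it joins a shortest prefix reaching the edge's source state, the edge symbol, and a shortest suffix leading to an accepting state.

```python
from collections import deque

# Assumption: a hand-built DFA for the regex ab*c (not extracted by AREXT).
# TRANSITIONS[state][symbol] gives the next state; missing symbols are rejected.
TRANSITIONS = {
    0: {"a": 1},
    1: {"b": 1, "c": 2},
    2: {},
}
START = 0
ACCEPTING = {2}


def shortest_paths_from(source):
    """BFS: map each reachable state to a shortest string leading to it."""
    paths = {source: ""}
    queue = deque([source])
    while queue:
        state = queue.popleft()
        for symbol, target in sorted(TRANSITIONS[state].items()):
            if target not in paths:
                paths[target] = paths[state] + symbol
                queue.append(target)
    return paths


def shortest_suffix_to_accept(state):
    """Shortest string driving the DFA from state to an accepting state."""
    paths = shortest_paths_from(state)
    candidates = [paths[s] for s in ACCEPTING if s in paths]
    return min(candidates, key=len) if candidates else None


def edge_covering_strings():
    """Build one accepted string per edge, then deduplicate."""
    prefixes = shortest_paths_from(START)
    strings = set()
    for state, edges in TRANSITIONS.items():
        if state not in prefixes:
            continue  # unreachable state: no string can exercise its edges
        for symbol, target in edges.items():
            suffix = shortest_suffix_to_accept(target)
            if suffix is not None:
                strings.add(prefixes[state] + symbol + suffix)
    return sorted(strings)


def covered_edges(strings):
    """Replay each string and collect the (source, symbol, target) edges used."""
    seen = set()
    for text in strings:
        state = START
        for symbol in text:
            target = TRANSITIONS[state].get(symbol)
            if target is None:
                break  # string leaves the DFA; stop tracing it
            seen.add((state, symbol, target))
            state = target
    return seen


if __name__ == "__main__":
    tests = edge_covering_strings()
    all_edges = {(s, c, t) for s, e in TRANSITIONS.items() for c, t in e.items()}
    print(tests)
    print(covered_edges(tests) == all_edges)
```

For this DFA two strings, `abc` and `ac`, exercise all three edges. Negative strings and edge-pair coverage, which the abstract also mentions, would need further construction and are left out of this sketch.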