Verbatim++: verified, optimized, and semantically rich lexing with derivatives

D. Egolf, Sam Lasser, Kathleen Fisher
{"title":"Verbatim++: verified, optimized, and semantically rich lexing with derivatives","authors":"D. Egolf, Sam Lasser, Kathleen Fisher","doi":"10.1145/3497775.3503694","DOIUrl":null,"url":null,"abstract":"Lexers and parsers are attractive targets for attackers because they often sit at the boundary between a software system's internals and the outside world. Formally verified lexers can reduce the attack surface of these systems, thus making them more secure. One recent step in this direction is the development of Verbatim, a verified lexer based on the concept of Brzozowski derivatives. Two limitations restrict the tool's usefulness. First, its running time is quadratic in the length of its input string. Second, the lexer produces tokens with a simple \"tag and string\" representation, which limits the tool's ability to integrate with parsers that operate on more expressive token representations. In this work, we present a suite of extensions to Verbatim that overcomes these limitations while preserving the tool's original correctness guarantees. The lexer achieves effectively linear performance on a JSON benchmark through a combination of optimizations that, to our knowledge, has not been previously verified. The enhanced version of Verbatim also enables users to augment their lexical specifications with custom semantic actions, and it uses these actions to produce semantically rich tokens---i.e., tokens that carry values with arbitrary, user-defined types. All extensions were implemented and verified with the Coq Proof Assistant.","PeriodicalId":196529,"journal":{"name":"Proceedings of the 11th ACM SIGPLAN International Conference on Certified Programs and Proofs","volume":"17 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 11th ACM SIGPLAN International Conference on Certified Programs and Proofs","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3497775.3503694","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 4

Abstract

Lexers and parsers are attractive targets for attackers because they often sit at the boundary between a software system's internals and the outside world. Formally verified lexers can reduce the attack surface of these systems, thus making them more secure. One recent step in this direction is the development of Verbatim, a verified lexer based on the concept of Brzozowski derivatives. Two limitations restrict the tool's usefulness. First, its running time is quadratic in the length of its input string. Second, the lexer produces tokens with a simple "tag and string" representation, which limits the tool's ability to integrate with parsers that operate on more expressive token representations. In this work, we present a suite of extensions to Verbatim that overcomes these limitations while preserving the tool's original correctness guarantees. The lexer achieves effectively linear performance on a JSON benchmark through a combination of optimizations that, to our knowledge, has not been previously verified. The enhanced version of Verbatim also enables users to augment their lexical specifications with custom semantic actions, and it uses these actions to produce semantically rich tokens---i.e., tokens that carry values with arbitrary, user-defined types. All extensions were implemented and verified with the Coq Proof Assistant.
经过验证,优化和语义丰富的词法与衍生
对于攻击者来说,词法分析器和解析器是很有吸引力的目标,因为它们通常位于软件系统内部和外部世界之间的边界。经过正式验证的词法分析器可以减少这些系统的攻击面,从而使它们更加安全。在这个方向上最近的一个步骤是开发了veratim,这是一个基于Brzozowski衍生词概念的验证词法器。有两个限制限制了该工具的实用性。首先,它的运行时间是输入字符串长度的二次函数。其次,词法分析器使用简单的“标记和字符串”表示生成标记,这限制了该工具与处理更具表现力的标记表示的解析器集成的能力。在这项工作中,我们提出了一套对Verbatim的扩展,这些扩展克服了这些限制,同时保留了工具原有的正确性保证。词法分析器通过一系列优化组合在JSON基准上实现了有效的线性性能,据我们所知,这些优化之前还没有经过验证。增强版的Verbatim还允许用户使用自定义语义操作来增强他们的词汇规范,并且它使用这些操作来生成语义丰富的令牌。带有任意用户定义类型的值的令牌。所有扩展都是通过Coq Proof Assistant实现和验证的。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信