{"title":"The Pitfalls of Protocol Design: Attempting to Write a Formally Verified PDF Parser","authors":"Andreas Bogk, Marco Schopl","doi":"10.1109/SPW.2014.36","DOIUrl":null,"url":null,"abstract":"Parsers for complex data formats generally present a big attack surface for input-driven exploitation. In practice, this has been especially true for implementations of the PDF data format, as witnessed by dozens of known vulnerabilities exploited in many real world attacks, with the Acrobat Reader implementation being the main target. In this report, we describe our attempts to use Coq, a theorem prover based on a functional programming language making use of dependent types and the Curry-Howard isomorphism, to implement a formally verified PDF parser. We ended up implementing a subset of the PDF format and proving termination of the combinator-based parser. Noteworthy results include a dependent type representing a list of strictly monotonically decreasing length of remaining symbols to parse, which allowed us to show termination of parser combinators. Also, difficulties showing termination of parsing some features of the PDF format readily translated into denial of service attacks against existing PDF parsers-we came up with a single PDF file that made all the existing PDF implementations we could test enter an endless loop.","PeriodicalId":142224,"journal":{"name":"2014 IEEE Security and Privacy Workshops","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2014 IEEE Security and Privacy Workshops","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SPW.2014.36","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 5
Abstract
Parsers for complex data formats generally present a big attack surface for input-driven exploitation. In practice, this has been especially true for implementations of the PDF data format, as witnessed by dozens of known vulnerabilities exploited in many real world attacks, with the Acrobat Reader implementation being the main target. In this report, we describe our attempts to use Coq, a theorem prover based on a functional programming language making use of dependent types and the Curry-Howard isomorphism, to implement a formally verified PDF parser. We ended up implementing a subset of the PDF format and proving termination of the combinator-based parser. Noteworthy results include a dependent type representing a list of strictly monotonically decreasing length of remaining symbols to parse, which allowed us to show termination of parser combinators. Also, difficulties showing termination of parsing some features of the PDF format readily translated into denial of service attacks against existing PDF parsers-we came up with a single PDF file that made all the existing PDF implementations we could test enter an endless loop.