研究报告:构建用于安全解析器开发的文件观察站的进展

Tim Allison, Wayne Burke, Dustin Graf, C. Mattmann, Anastasija Mensikova, Michael Milano, Philip Southam, R. Stonebraker
{"title":"研究报告:构建用于安全解析器开发的文件观察站的进展","authors":"Tim Allison, Wayne Burke, Dustin Graf, C. Mattmann, Anastasija Mensikova, Michael Milano, Philip Southam, R. Stonebraker","doi":"10.1109/spw54247.2022.9833875","DOIUrl":null,"url":null,"abstract":"Parsing untrusted data is notoriously challenging. Failure to handle maliciously crafted data correctly can (and does) lead to a wide range of vulnerabilities. The Language-theoretic security (LangSec) philosophy seeks to obviate the need for developers to apply ad hoc solutions by, instead, offering formally correct and verifiable input handling throughout the software development lifecycle. One of the key components in developing secure parsers is a broad coverage corpus that enables developers to understand the problem space for a given format and to use, potentially, as seeds for fuzzing and other automated testing. In this paper, we offer an update on work reported at the LangSec 2021 conference on the development of a file observatory to gather and enable analysis on a diverse collection of files at scale. The initial focus of the observatory is on Portable Document Format (PDF) files and file formats typically embedded in PDFs. In this paper, we report on refactoring the ingest process, applying new analytic methods, and improving the User Interface.","PeriodicalId":334852,"journal":{"name":"2022 IEEE Security and Privacy Workshops (SPW)","volume":"2014 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Research Report: Progress on Building a File Observatory for Secure Parser Development\",\"authors\":\"Tim Allison, Wayne Burke, Dustin Graf, C. Mattmann, Anastasija Mensikova, Michael Milano, Philip Southam, R. Stonebraker\",\"doi\":\"10.1109/spw54247.2022.9833875\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Parsing untrusted data is notoriously challenging. Failure to handle maliciously crafted data correctly can (and does) lead to a wide range of vulnerabilities. The Language-theoretic security (LangSec) philosophy seeks to obviate the need for developers to apply ad hoc solutions by, instead, offering formally correct and verifiable input handling throughout the software development lifecycle. One of the key components in developing secure parsers is a broad coverage corpus that enables developers to understand the problem space for a given format and to use, potentially, as seeds for fuzzing and other automated testing. In this paper, we offer an update on work reported at the LangSec 2021 conference on the development of a file observatory to gather and enable analysis on a diverse collection of files at scale. The initial focus of the observatory is on Portable Document Format (PDF) files and file formats typically embedded in PDFs. In this paper, we report on refactoring the ingest process, applying new analytic methods, and improving the User Interface.\",\"PeriodicalId\":334852,\"journal\":{\"name\":\"2022 IEEE Security and Privacy Workshops (SPW)\",\"volume\":\"2014 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-05-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 IEEE Security and Privacy Workshops (SPW)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/spw54247.2022.9833875\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE Security and Privacy Workshops (SPW)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/spw54247.2022.9833875","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

摘要

解析不可信的数据是非常具有挑战性的。如果不能正确处理恶意制作的数据,可能(而且确实)会导致各种各样的漏洞。语言理论安全(LangSec)哲学旨在通过在整个软件开发生命周期中提供正式正确且可验证的输入处理,来避免开发人员应用特殊解决方案的需要。开发安全解析器的关键组件之一是广泛覆盖的语料库,它使开发人员能够理解给定格式的问题空间,并可能将其用作模糊测试和其他自动化测试的种子。在本文中,我们提供了在LangSec 2021会议上报告的工作更新,该会议讨论了文件观测站的发展,以收集和分析大规模的各种文件集合。天文台最初的工作重点是可移植文件格式(PDF)文件和通常嵌入在PDF中的文件格式。在本文中,我们报告了重构摄取过程,应用新的分析方法,以及改进用户界面。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Research Report: Progress on Building a File Observatory for Secure Parser Development
Parsing untrusted data is notoriously challenging. Failure to handle maliciously crafted data correctly can (and does) lead to a wide range of vulnerabilities. The Language-theoretic security (LangSec) philosophy seeks to obviate the need for developers to apply ad hoc solutions by, instead, offering formally correct and verifiable input handling throughout the software development lifecycle. One of the key components in developing secure parsers is a broad coverage corpus that enables developers to understand the problem space for a given format and to use, potentially, as seeds for fuzzing and other automated testing. In this paper, we offer an update on work reported at the LangSec 2021 conference on the development of a file observatory to gather and enable analysis on a diverse collection of files at scale. The initial focus of the observatory is on Portable Document Format (PDF) files and file formats typically embedded in PDFs. In this paper, we report on refactoring the ingest process, applying new analytic methods, and improving the User Interface.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信