BinaryInferno: A Semantic-Driven Approach to Field Inference for Binary Message Formats

Jared Chandler, Adam Wick, Kathleen Fisher
Proceedings 2023 Network and Distributed System Security Symposium. DOI: https://doi.org/10.14722/ndss.2023.23131
Citations: 2

Abstract

We present BinaryInferno, a fully automatic tool for reverse engineering binary message formats. Given a set of messages with the same format, the tool uses an ensemble of detectors to infer a collection of partial descriptions and then automatically integrates the partial descriptions into a semantically meaningful description that can be used to parse future packets with the same format. As its ensemble, BinaryInferno uses a modular and extensible set of targeted detectors, including detectors for identifying atomic data types such as IEEE floats, timestamps, and integer length fields; for finding boundaries between adjacent fields using Shannon entropy; and for discovering variable-length sequences by searching for common serialization idioms. We evaluate BinaryInferno's performance on sets of packets drawn from 10 binary protocols. Our semantic-driven approach significantly decreases false positive rates and increases precision when compared to the previous state of the art. For top-level protocols we identify field boundaries with an average precision of 0.69, an average recall of 0.73, and an average false positive rate of 0.04, significantly outperforming five other state-of-the-art protocol reverse engineering tools on the same data sets: Awre (0.18, 0.03, 0.04), FieldHunter (0.68, 0.37, 0.01), Nemesys (0.31, 0.44, 0.11), Netplier (0.29, 0.75, 0.22), and Netzob (0.57, 0.42, 0.03). We believe our improvements in precision and false positive rates represent what our target user most wants: semantically meaningful descriptions with fewer false positives.
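
The abstract mentions using Shannon entropy to find boundaries between adjacent fields. The following is a minimal sketch of that general idea, not BinaryInferno's actual detector: it computes the entropy of the byte value at each offset across a set of same-format messages and flags offsets where the entropy jumps sharply. The function names, the `jump` threshold, and the sample messages are all assumptions made for illustration.

```python
import math
from collections import Counter


def byte_entropy_by_offset(messages):
    """Shannon entropy (in bits) of the byte value at each offset, across messages."""
    # Only consider offsets that exist in every message.
    width = min(len(m) for m in messages)
    total = len(messages)
    entropies = []
    for offset in range(width):
        counts = Counter(m[offset] for m in messages)
        h = -sum((c / total) * math.log2(c / total) for c in counts.values())
        entropies.append(h)
    return entropies


def candidate_boundaries(entropies, jump=1.0):
    """Offsets whose entropy differs from the previous offset by at least `jump` bits.

    The threshold is a hypothetical tuning knob, not a value from the paper.
    """
    return [
        i
        for i in range(1, len(entropies))
        if abs(entropies[i] - entropies[i - 1]) >= jump
    ]


# Hypothetical messages: a constant 2-byte header followed by 2 varying bytes.
messages = [
    bytes([0xAB, 0x00, 0x01, 0x10]),
    bytes([0xAB, 0x00, 0x02, 0x20]),
    bytes([0xAB, 0x00, 0x03, 0x30]),
    bytes([0xAB, 0x00, 0x04, 0x40]),
]

entropies = byte_entropy_by_offset(messages)
boundaries = candidate_boundaries(entropies)
```

In this toy input the constant header bytes have zero entropy and the varying bytes have two bits each, so the only flagged offset is the one where the header ends. Real traffic would need more messages and a less naive threshold, and entropy alone says nothing about what a field means, which is why the abstract describes it as one detector among several.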