What Makefile? Detecting Compiler Information Without Source Using The Code Property Graph

Shaun R. Deaton
Published in: 2022 IEEE 4th International Conference on Trust, Privacy and Security in Intelligent Systems, and Applications (TPS-ISA)
Publication date: 2022-12-01
DOI: 10.1109/TPS-ISA56441.2022.00039 (https://doi.org/10.1109/TPS-ISA56441.2022.00039)
Citations: 0

Abstract

Users frequently lack access to the underlying source code and build artifacts of the programs they use. Without that access, uncovering information about a program, such as compiler information or security properties, becomes a difficult task. Many static analysis methods exist for source code, but few tools work solely with executable machine code. This paper proposes constructing the code property graph from a program’s lifted machine code in order to observe structural differences between executables. We implement our approach with the Binary Ninja Intermediate Language (BNIL) and the graph2vec neural embedding framework to create embedded representations of the program’s graph properties. Downstream applications, such as supervised machine learning, can then analyze these representations. We demonstrate the effectiveness of our approach by training a supervised random forest classifier on the embedded graphs to determine, at the function level, whether clang or gcc created the executable a function belongs to. Our results achieved an accuracy of 100% across our testing set of 25,600 samples.
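
The abstract's central idea, that graph structure alone can separate executables, can be illustrated with a small sketch. graph2vec builds its per-graph vocabulary from Weisfeiler-Lehman (WL) subtree relabelling before learning an embedding, so the sketch below shows only that relabelling step, using the Python standard library. It is not the paper's implementation: the function name `wl_features`, the two toy graphs, and the instruction-like node labels are all hypothetical, and the graphs are treated as undirected and untyped, whereas a real code property graph has directed, typed edges.

```python
import hashlib
from collections import Counter


def wl_features(adjacency, labels, iterations=2):
    """Count Weisfeiler-Lehman subtree labels for one graph.

    adjacency: dict mapping node -> list of neighbouring nodes
    labels:    dict mapping node -> initial string label
    """
    current = dict(labels)
    features = Counter(current.values())  # iteration 0: the raw labels
    for _ in range(iterations):
        updated = {}
        for node, neighbours in adjacency.items():
            # A node's new label summarises its own label plus the
            # sorted multiset of its neighbours' labels.
            neighbour_part = ",".join(sorted(current[n] for n in neighbours))
            signature = current[node] + "|" + neighbour_part
            updated[node] = hashlib.sha1(signature.encode()).hexdigest()[:8]
        current = updated
        features.update(current.values())
    return features


# Two hypothetical toy graphs with identical node labels but different shape.
chain = {0: [1], 1: [0, 2], 2: [1]}    # load - add - ret
star = {0: [1, 2], 1: [0], 2: [0]}     # add - load - ret (load in the centre)
node_labels = {0: "load", 1: "add", 2: "ret"}

chain_features = wl_features(chain, node_labels)
star_features = wl_features(star, node_labels)
```

With zero iterations the two graphs are indistinguishable because they carry the same labels; after one round of relabelling their feature counts diverge, which is the kind of structural signal an embedding and a downstream classifier could then pick up.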