组合代码分类与漏洞评级

Joseph R. Barr, Peter Shaw, F. Abu-Khzam, Sheng Yu, Heng Yin, Tyler Thatcher
{"title":"组合代码分类与漏洞评级","authors":"Joseph R. Barr, Peter Shaw, F. Abu-Khzam, Sheng Yu, Heng Yin, Tyler Thatcher","doi":"10.1109/TransAI49837.2020.00017","DOIUrl":null,"url":null,"abstract":"Empirical analysis of source code of Android Fluoride Bluetooth stack demonstrates a novel approach of classification of source code and rating for vulnerability. A workflow that combines deep learning and combinatorial techniques with a straightforward random forest regression is presented. Two kinds of embedding are used: code2vec and LSTM, resulting in a distance matrix that is interpreted as a (combinatorial) graph whose vertices represent code components, functions and methods. Cluster Editing is then applied to partition the vertex set of the graph into subsets representing nearly complete subgraphs. Finally, the vectors representing the components are used as features to model the components for vulnerability risk.","PeriodicalId":151527,"journal":{"name":"2020 Second International Conference on Transdisciplinary AI (TransAI)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"9","resultStr":"{\"title\":\"Combinatorial Code Classification & Vulnerability Rating\",\"authors\":\"Joseph R. Barr, Peter Shaw, F. Abu-Khzam, Sheng Yu, Heng Yin, Tyler Thatcher\",\"doi\":\"10.1109/TransAI49837.2020.00017\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Empirical analysis of source code of Android Fluoride Bluetooth stack demonstrates a novel approach of classification of source code and rating for vulnerability. A workflow that combines deep learning and combinatorial techniques with a straightforward random forest regression is presented. Two kinds of embedding are used: code2vec and LSTM, resulting in a distance matrix that is interpreted as a (combinatorial) graph whose vertices represent code components, functions and methods. Cluster Editing is then applied to partition the vertex set of the graph into subsets representing nearly complete subgraphs. Finally, the vectors representing the components are used as features to model the components for vulnerability risk.\",\"PeriodicalId\":151527,\"journal\":{\"name\":\"2020 Second International Conference on Transdisciplinary AI (TransAI)\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-09-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"9\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2020 Second International Conference on Transdisciplinary AI (TransAI)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/TransAI49837.2020.00017\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 Second International Conference on Transdisciplinary AI (TransAI)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/TransAI49837.2020.00017","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 9

摘要

通过对Android氟化物蓝牙堆栈源代码的实证分析,提出了一种新的源代码分类和漏洞评级方法。提出了一种将深度学习和组合技术与简单的随机森林回归相结合的工作流。使用了两种嵌入:code2vec和LSTM,产生一个距离矩阵,该矩阵被解释为一个(组合)图,其顶点表示代码组件、函数和方法。然后应用聚类编辑将图的顶点集划分为代表几乎完全子图的子集。最后,利用表示组件的向量作为特征对组件进行脆弱性风险建模。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Combinatorial Code Classification & Vulnerability Rating
Empirical analysis of source code of Android Fluoride Bluetooth stack demonstrates a novel approach of classification of source code and rating for vulnerability. A workflow that combines deep learning and combinatorial techniques with a straightforward random forest regression is presented. Two kinds of embedding are used: code2vec and LSTM, resulting in a distance matrix that is interpreted as a (combinatorial) graph whose vertices represent code components, functions and methods. Cluster Editing is then applied to partition the vertex set of the graph into subsets representing nearly complete subgraphs. Finally, the vectors representing the components are used as features to model the components for vulnerability risk.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信