A Survey on Source Code Review Using Machine Learning

Wang Xiaomeng, Zhang Tao, Xin Wei, Hou Changyu
Published in: 2018 3rd International Conference on Information Systems Engineering (ICISE)
Publication date: 2018-05-04
DOI: 10.1109/ICISE.2018.00018 (https://doi.org/10.1109/ICISE.2018.00018)
Citations: 3

Abstract

Source code review contributes substantially to the security of software systems. Scalability and precision are important for deploying code review tools. However, traditional tools detect only some security flaws automatically, with high false-positive and false-negative rates, and require tedious review of large-scale source code. Various flaws and vulnerabilities exhibit specific characteristics in source code. Machine learning systems take feature matrices built from source code (covering variables, functions, and files) as input and produce ad hoc labels using discriminative or generative methods, so that source code can be reviewed automatically and intelligently. Source code, whatever the programming language, is textual information by nature. Both secure and vulnerable features can be extracted from source code. Fortunately, a variety of machine learning approaches have been developed to learn and detect flaws and vulnerabilities in intelligent source code security review. Combining semantic and syntactic code features helps reduce false positives and false negatives during source code review. In this paper, we review the literature on intelligent source code security review using machine learning methods. The review presents the primary evidence for applying ML to source code security review. We believe machine learning and its branches will become prominent in source code review.
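
To make the idea of treating source code as text and feeding a feature matrix to a learner concrete, here is a minimal sketch in Python. It is not taken from the paper: the toy corpus, the labels, the function names (tokenize, build_vocabulary, to_feature_matrix, predict) and the choice of a 1-nearest-neighbour rule are all assumptions made for illustration only. Real systems would use far richer syntactic and semantic features and much larger labelled datasets.

```python
import re
from collections import Counter

# Hypothetical toy corpus (an assumption, not data from the paper):
# each entry is (code snippet, label), where 1 = vulnerable and 0 = secure.
SAMPLES = [
    ("strcpy(buf, input);", 1),
    ("strncpy(buf, input, sizeof(buf) - 1);", 0),
    ("gets(line);", 1),
    ("fgets(line, sizeof(line), stdin);", 0),
]

# Identifiers, integer literals, or any single punctuation character.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|[^\s\w]")


def tokenize(code):
    """Treat code as plain text and split it into lexical tokens."""
    return TOKEN_RE.findall(code)


def build_vocabulary(snippets):
    """Sorted list of every distinct token seen in the corpus."""
    return sorted({tok for code in snippets for tok in tokenize(code)})


def to_feature_matrix(snippets, vocab):
    """One row per snippet, one column per vocabulary token (raw counts)."""
    rows = []
    for code in snippets:
        counts = Counter(tokenize(code))
        rows.append([counts[tok] for tok in vocab])
    return rows


def predict(code, train_rows, train_labels, vocab):
    """Label a snippet with the label of its nearest training row
    (squared Euclidean distance over token counts)."""
    query = to_feature_matrix([code], vocab)[0]
    distances = [
        sum((q - t) ** 2 for q, t in zip(query, row)) for row in train_rows
    ]
    return train_labels[distances.index(min(distances))]


snippets = [code for code, _ in SAMPLES]
labels = [label for _, label in SAMPLES]
vocab = build_vocabulary(snippets)
X = to_feature_matrix(snippets, vocab)

print(predict("gets(buf);", X, labels, vocab))
```

The sketch uses only lexical token counts, which corresponds to the "code is text" view in the abstract. Adding syntactic or semantic features, as the abstract suggests, would mean appending further columns to each row, for example features derived from a parse tree or from data-flow analysis.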