MultiCode: A Unified Code Analysis Framework based on Multi-type and Multi-granularity Semantic Learning
Authors: Xu Duan, Jingzheng Wu, Mengnan Du, Tianyue Luo, Mutian Yang, Yanjun Wu
Published in: 2021 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW), 2021-10-01
DOI: https://doi.org/10.1109/ISSREW53611.2021.00102
Citations: 1
Abstract
Code analysis is one of the common ways to ensure software reliability. With the development of machine learning technology, more and more learning-based code analysis methods have been proposed. However, most existing methods target a specific code analysis task, so industrial applications must spend extra effort implementing a different model for each task. In this paper, we propose MultiCode, a novel unified code analysis framework that learns code semantic information of different types and granularities to cover the semantic information required by different tasks, so that it can be effectively adapted to multiple tasks with higher accuracy. To demonstrate the effectiveness of MultiCode, we evaluate it on two common tasks: vulnerability detection and code clone detection. Experimental results show that MultiCode achieves F1-scores of 94.6%, 92.5% and 97.1% on the SARD-BE, SARD-RME and OJClone datasets, respectively, which are significantly higher than those of advanced existing methods.
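For readers unfamiliar with the metric, the F1-score reported above is the harmonic mean of precision and recall. The snippet below is a minimal sketch of the standard definition only; the function name `f1_score` and the counts passed to it are hypothetical and are not taken from the paper or its evaluation code.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return the F1-score from true-positive, false-positive and false-negative counts.

    Uses the closed form F1 = 2*TP / (2*TP + FP + FN), which is algebraically
    equal to the harmonic mean of precision and recall.
    """
    denominator = 2 * tp + fp + fn
    if denominator == 0:
        # No positive labels and no positive predictions: define F1 as 0.0.
        return 0.0
    return 2 * tp / denominator


# Hypothetical counts chosen only for illustration (not results from the paper).
example = f1_score(tp=90, fp=10, fn=10)
print(f"F1 = {example:.3f}")
```

Because F1 ignores true negatives, it is a common choice for tasks such as vulnerability detection, where the negative class typically dominates.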