Automatic Detection for Reused Open Source Codes Based on Similarity Identification of Software Networks
Authors: Tao Shi, Liang Yan, Haoran Guo, J. Ai
Published in: 2019 6th International Conference on Dependable Systems and Their Applications (DSA)
Publication date: 2020-01-01
DOI: 10.1109/DSA.2019.00042 (https://doi.org/10.1109/DSA.2019.00042)
Citations: 0
Abstract
Software plays an increasingly important role in today's world. With the advent of open-source software, a growing number of developers have come to rely on open-source software as a basic tool for their program development. At the same time, however, incorporating open-source software into their own software also introduces various types of defects and disadvantages. These unknown risks may cause incalculable economic losses and credit crises if they are exploited in the future. It is therefore an important and urgent problem to detect the open-source components that may be reused when software is outsourced. To help detect open-source software components in large-scale software projects, this paper proposes a technique for automatically identifying subnetworks with similar structural characteristics. The technique is based on node role classification, node similarity matching, and similar-subnetwork search, and it applies complex network methods to the comparison of software networks. In contrast to traditional code detection techniques, this study does not depend on the textual content of the software's source code. Because the basic skeleton of an application remains the same during code reuse and feature changes, its network structure is used instead of its software structure, which avoids problems such as poor detection of similar code after text modification. The technique starts from the software's features and the structure of its software network.
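To make the structural idea concrete, the following is a minimal sketch of matching nodes between two small call graphs by structure alone. It is not the authors' algorithm: the abstract does not specify the features, role definitions, or similarity measure, so the degree-based features, the three role labels (`entry`, `connector`, `leaf`), the distance-based similarity, and the `threshold` parameter are all assumptions chosen for illustration. It uses only the Python standard library.

```python
from math import sqrt


def degree_features(edges):
    """Map each node of a directed edge list to (in_degree, out_degree)."""
    feats = {}
    for src, dst in edges:
        feats.setdefault(src, [0, 0])
        feats.setdefault(dst, [0, 0])
        feats[src][1] += 1  # src gains an outgoing edge
        feats[dst][0] += 1  # dst gains an incoming edge
    return {node: tuple(counts) for node, counts in feats.items()}


def role(feature):
    """Assign a coarse role from degrees (hypothetical role scheme)."""
    in_deg, out_deg = feature
    if in_deg == 0:
        return "entry"
    if out_deg == 0:
        return "leaf"
    return "connector"


def similarity(a, b):
    """Similarity in (0, 1]: 1 / (1 + Euclidean distance of the features)."""
    return 1.0 / (1.0 + sqrt((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2))


def match_nodes(edges_a, edges_b, threshold=0.5):
    """For each node in graph A, pick the most similar same-role node in B.

    The mapping is not forced to be one-to-one; ties are broken by node name.
    """
    feats_a = degree_features(edges_a)
    feats_b = degree_features(edges_b)
    matches = {}
    for node_a, fa in feats_a.items():
        candidates = [
            (similarity(fa, fb), node_b)
            for node_b, fb in feats_b.items()
            if role(fa) == role(fb)
        ]
        if candidates:
            score, best = max(candidates)
            if score >= threshold:
                matches[node_a] = best
    return matches


# Two call graphs with the same shape but different identifiers,
# as might result from renaming functions in reused code.
original = [("main", "parse"), ("main", "run"), ("run", "log"), ("parse", "log")]
renamed = [("start", "p"), ("start", "r"), ("r", "w"), ("p", "w")]
print(match_nodes(original, renamed))
```

Because the comparison looks only at how nodes are connected, renaming every identifier leaves the match between the entry point and the shared leaf intact. A fuller approach would use richer node features and then search for connected groups of matched nodes, corresponding to the similar-subnetwork search step named in the abstract.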