CLASC: A Changelog Based Automatic Code Source Classification Method for Operating System Packages

Yi Ren, Jianbo Guan, Jun Ma, Yusong Tan, Qingbo Wu, Y. Ding
Published in: 2019 26th Asia-Pacific Software Engineering Conference (APSEC), 2019-12-01
DOI: 10.1109/APSEC48747.2019.00058 (https://doi.org/10.1109/APSEC48747.2019.00058)

Abstract

Open source is an important way in which today's software is developed. The adoption of open source software continues to accelerate because of the potential it offers, such as productivity improvement, cost savings and quicker innovation. As the complexity and size of software compositions grow, it becomes difficult to effectively scan and track the code source, especially for software with very large code bases, such as operating systems. So far, existing work on open source components focuses mainly on mitigating potential license noncompliance, reducing security risks introduced by open source vulnerabilities, and detecting and matching open source components in code. To ensure code traceability and manageability for large-scale mixed-source operating systems, we believe it is beneficial to automatically distinguish the sources of the system code at the granularity of software packages and to manage them separately. However, according to the literature, there is a lack of relevant work in this area. In this paper, we first classify packages into three categories in terms of code source, from the perspective of OS developers and maintainers. We then propose CLASC, an efficient code source classification algorithm. By extracting and analyzing package information, CLASC classifies software packages into the defined categories according to their changelog information. We also design and implement KyAnalyzer, a Web-based package management and code source analysis platform. KyAnalyzer incorporates CLASC as a component, provides automatic code source analysis services, and manages OS packages differently according to their code source category. Experimental results show the correctness and efficiency of the Web-enabled package source classifier.
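
To make the idea of changelog-based classification concrete, the following is a minimal sketch and not the CLASC algorithm itself, whose details the abstract does not give. It assumes RPM-style changelog entries and a single in-house email domain; the three category names, the function names, and the domain `example-os.org` are all hypothetical.

```python
import re
from enum import Enum


class CodeSource(Enum):
    # Hypothetical names: the abstract states there are three categories
    # but does not name them.
    UPSTREAM = "upstream"              # no in-house changelog entries
    MODIFIED = "modified"              # upstream package with in-house changes
    SELF_DEVELOPED = "self-developed"  # every entry written in-house


# Header line of an RPM-style changelog entry (an assumed input format), e.g.
# "* Tue Mar 05 2019 Jane Doe <jane@example.org> - 1.2-3"
ENTRY_HEADER = re.compile(
    r"^\*\s+\w{3}\s+\w{3}\s+\d{1,2}\s+\d{4}\s+.*?<([^>]+)>"
)


def author_emails(changelog: str) -> list[str]:
    """Return the author email of every changelog entry header, lowercased."""
    emails = []
    for line in changelog.splitlines():
        match = ENTRY_HEADER.match(line.strip())
        if match:
            emails.append(match.group(1).lower())
    return emails


def classify(changelog: str, in_house_domain: str) -> CodeSource:
    """Classify a package by who wrote its changelog entries.

    Assumed rule: entries authored from `in_house_domain` count as in-house
    changes. A changelog with no recognizable entries falls back to UPSTREAM.
    """
    emails = author_emails(changelog)
    in_house = [e for e in emails if e.endswith("@" + in_house_domain)]
    if not in_house:
        return CodeSource.UPSTREAM
    if len(in_house) == len(emails):
        return CodeSource.SELF_DEVELOPED
    return CodeSource.MODIFIED
```

Under these assumptions, a package whose changelog mixes entries from the in-house domain with entries from other authors is reported as `CodeSource.MODIFIED`. A real classifier would likely need more signals than the author email, such as the wording of each entry.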